Bryostatins, bryopyrans and polyketides: compositions and methods

ABSTRACT

The present invention recognizes that marine organisms comprise nucleic acid molecules that encode polypeptides that catalyze the synthesis of bioactive compounds, such as polyketides including bryopyran rings, such as bryostatins. One aspect of the present invention is a composition including at least one nucleic acid molecule that encodes at least one polypeptide that catalyzes at least one step in the synthesis of at least one polyketide such as a bryopyran ring, wherein said at least one nucleic acid molecule is derived from at least one marine organism. A second aspect of the present invention is a composition including a library of nucleic acid molecules of the present invention. These nucleic acid molecules can be used in a combinatorial biosynthesis of polyketides, bryopyran rings and bryostatins.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of priority of U.S. Provisional Patent application No. 60/147,283 to Haygood et al., filed Aug. 4, 1999, which is incorporated by reference herein in its entirety.

STATEMENT OF GOVERNMENT RIGHTS

[0002] This invention was made at least in part with government support awarded by the National Institutes of Health (NCI/NIH 2R44-CA58158-02A3 to CalBioMarine Technologies, Inc.) and California Sea Grant (RMP-61). The United States Government may have certain rights in the invention.

TECHNICAL FIELD

[0003] The present invention generally relates to polyketides, including bryopyran rings, such as bryostatins, and methods of making polyketides, bryopyran rings and bryostatins.

BACKGROUND OF THE INVENTION

[0004] Polyketide synthases (PKS) are enzymes that catalyze the synthesis of polyketides, a class of compounds that have diverse activities, including a variety of bioactivities such as anticancer and immunomodulatory activity. Polyketides are created by the sequential condensation of acetate or other simple fatty acid units in a manner analogous to fatty acid synthesis. There are two types of cyclic polyketides, complex and aromatic; bryostatins are classified as complex polyketides. PKS enzymes are classified as Type I (PKS-I), having multiple active sites on a single polypeptide, or Type II (PKS-II), having single active site polypeptides that form a complex.

[0005] Bryostatins are a set of bioactive complex polyketides based on a bryopyran ring structure whose synthetic pathways have evaded elucidation. Bryostatins are found in invertebrates of the genus Bugula, such as Bugula neritina, and are believed to give the marine invertebrates a competitive advantage in the environment due to their toxicity. This toxicity also makes the bryostatins attractive pharmaceutical agents. Bryostatins, such as bryostatin 1, have been used extensively in clinical trials for the treatment of a variety of cancers, carcinomas and lymphomas. However, the only sources of bryostatins are natural collections and aquaculture. These sources can be unpredictable due to environmental conditions, such as El Niño events, and overharvesting of wild populations. Thus, there exists a need for a reliable and economical source for bryostatins.

[0006] The present invention addresses this and other needs by identifying and characterizing genes that are involved in the synthesis of polyketides such as bryopyran ring structures, such as bryostatins, and expressing these genes in heterologous organisms. These genes can be used to produce base structures, such as bryopyran rings, that can form the basis of combinatorial chemistry to produce a wide variety of compounds, including those made using combinatorial biosynthetic procedures. These compounds can be screened for a variety of bioactivities including anticancer activity. The present invention provides related benefits as well.

SUMMARY OF THE INVENTION

[0007] The present invention recognizes that marine organisms contain nucleic acid molecules that encode polypeptides that catalyze the synthesis of bioactive compounds, such as polyketides including bryopyran rings, such as bryostatins.

[0008] One aspect of the present invention is a composition including at least one isolated nucleic acid molecule that encodes at least one polypeptide that catalyzes at least one step in the synthesis of at least one polyketide such as a bryopyran ring, wherein said at least one nucleic acid molecule is derived from at least one marine organism.

[0009] A second aspect of the present invention is a composition including a library of nucleic acid molecules of the present invention. These nucleic acid molecules can be used in a combinatorial biosynthesis of polyketides, bryopyran rings and bryostatins.

[0010] A third aspect of the present invention is a composition including at least one isolated polypeptide that catalyzes at least one step in the synthesis of at least one polyketide such as a bryopyran ring, wherein said at least one polypeptide is derived from at least one marine organism.

[0011] A fourth aspect of the present invention is a composition including a library of polypeptides of the present invention.

[0012] A fifth aspect of the present invention is a method of making a composition, including providing at least one nucleic acid of the present invention and synthesizing at least one polyketide such as a bryopyran ring.

[0013] A sixth aspect of the present invention is a composition made using a nucleic acid of the present invention.

[0014] A seventh aspect of the present invention is a method of making a composition including providing at least one polypeptide of the present invention and synthesizing at least one polyketide such as a bryopyran ring.

[0015] An eighth aspect of the present invention is a composition made using a polypeptide of the present invention.

[0016] A ninth aspect of the present invention is a method for identifying at least one nucleic acid molecule encoding at least one activity of a PKS including contacting a nucleic acid molecule of the present invention with a sample, and identifying nucleic acid molecules in said sample that hybridize with a nucleic acid of the present invention.

[0017] A tenth aspect of the present invention is a nucleic acid molecule identified by a method of the present invention.

[0018] An eleventh aspect of the present invention is a composition comprising a library of nucleic acid molecules of the present invention.

[0019] A twelfth aspect of the present invention is a method for identifying a bioactive compound including contacting a compound made or identified using a nucleic acid molecule of the present invention and determining the bioactivity of the compound.

[0020] A thirteenth aspect of the present invention is a method for identifying a bioactive compound including contacting a compound made or identified using a polypeptide of the present invention and determining the bioactivity of said compound.

[0021] A fourteenth aspect of the present invention is a preparation of bacteria from a Bugula that include PKS genes.

[0022] A fifteenth aspect of the present invention is a polyketide, bryopyran ring or bryostatin present in Bugula pacifica.

BRIEF DESCRIPTION OF THE FIGURES

[0023] FIG. 1 depicts structures of illustrative bryostatins.

[0024] FIG. 2 depicts reactions catalyzed by type I Polyketide synthase EryA.

[0025] FIG. 3 depicts an expected domain structure of a bryopyran synthase and predicted structures.

[0026] FIG. 4 depicts a method for expression of bryostatin synthase genes in S. venezuelae.

[0027] FIG. 5 depicts a method for intermodular PCR for PKS genes.

[0028] FIG. 6 depicts PCR amplification products of various Bugula, including B. neritina from California and North Carolina using KSa PCR primers of the present invention.

[0029] FIG. 7A to FIG. 7D depict the effects of treatment of B. neritina with gentamicin and the production of bryostatins.

[0030] FIG. 8 depicts competitive PCR analysis of DNA preparations from B. neritina tips. Two different DNA preparations were subjected to competitive PCR using a clone of KSa with an internal deletion. Samples were electrophoresed on a 1.5% agarose gel. Lanes 1 and 14 are the “1 Kb ladder” molecular weight marker from Bethesda Research Labs. Lanes 2, 3, and 4 are 10⁻⁴, 10⁻⁵, and 10⁻⁶ dilutions, respectively, of competitor DNA alone. Lane 5 is an unfractionated DNA prep with no competitor, and lanes 6-8 are 10⁻⁴, 10⁻⁵, and 10⁻⁶ dilutions, respectively, of the competitor DNA mixed with the unfractionated DNA prep. Lane 9 is amplification of the fractionated DNA prep without competitor, and lanes 10-12 are 10⁻⁴, 10⁻⁵, and 10⁻⁶ dilutions, respectively, of the competitor DNA mixed with the fractionated DNA prep. Lane 13 is a control of B. neritina larval DNA with no competitor. Equivalent amounts of amplification of the competitor and the full-length product are visible at the 10⁻⁵ dilution in lane 7 and at the 10⁻⁴ dilution in lane 10. This indicates that the representation of E. sertula DNA in the fractionated DNA prep was 10-fold higher than in the unfractionated sample.

[0031] FIG. 9 depicts PCR screening of cosmid clones with KSa-specific primers. PCR was performed on DNA isolated from the clones listed above each lane, using KSa-specific primers, and the products run on a 1.0% agarose gel. Mkr is a 1 Kb ladder, and Cnt. is a control with no added DNA.

[0032] FIG. 10 depicts EcoRI restriction digest patterns of cosmids 2A, 3A, 4A, and 6A. Digests were run on a 0.6% agarose gel. Mkr is a 1 Kb ladder, clones are indicated above the lanes.

[0033] FIG. 11 depicts hybridization of 6A T3-end probe to EcoRI/SalI digests of cosmids 5A, 5B, 3A and 6A.

[0034] FIG. 12 depicts relative location of cosmid clones in relation to PKS gene cluster.

[0035] FIG. 13 depicts a clone and sequencing map of PKS cluster region. The positions of clones 3A, 6A, 5A, and 5B are indicated by lines and identified to the right of the lines. Regions sequenced on each clone are denoted by horizontal lines above or below the clone line. For 3A, lines above and below the line indicate that the complete sequence has been obtained on both DNA strands. Regions in the other sequences have been determined mostly on a single strand, and although some sequence on both strands may be present, it is not denoted. T3 or T7 at the ends of clones indicate the orientation of the clone in the cloning vector. Vertical bars represent either the end of a clone or the position of PstI restriction sites. Letters in between vertical bars above the lines in 5A and 5B indicate the name of the cloned restriction fragment sequenced. For 6A, two contigs are noted. For Pst A2/F4/C2 in 5A, it is known that these fragments are contiguous but their precise location in the clone is not known; this is denoted by the arrows. A general region that has been deleted in a portion of the total DNA prep from 5A and 5B is indicated. The preps contain both full length copies of the insert and deleted copies, so the overall map is accurate. The long solid bar below represents the PKS cluster region; PKS homology identified in the sequencing is marked by “PKS>” above the bar. A scale in kbp is at bottom left.

[0036] FIG. 14A depicts a contig map of cosmid 3A containing the beginning of the PKS cluster. Clones used for sequencing are listed at left, arrows denote the beginning and end of sequence data obtained for each clone. Bar below indicates the number of base pairs in the contig, which was generated by Sequencher, vers. 3.1.

[0037] FIG. 14B is the nucleotide and amino acid sequence of the PKS cluster from clone 3A.

[0038] FIG. 15A depicts a contig map of cosmid 6A downstream of 3A.

[0039] FIG. 15B is the contig sequences from clone 6A.

[0040] FIG. 16A depicts a contig map of cosmid 5A Pst A2/F4/C2 region.

[0041] FIG. 16B is a nucleotide sequence from clone 5A.

[0042] FIG. 17A depicts contigs of sequence overlapping PstI fragments in 5B.

[0043] FIG. 17B is a nucleotide sequence from a portion of clone 5B.

[0044] FIG. 18A depicts a contig map of T7 end of cosmid 5A, through 5B Pst A7, to the T3 end of 5B.

[0045] FIG. 18B is a nucleotide sequence from a portion of clone 5B.

[0046] FIG. 19 depicts PCR amplification products separated by denaturing gradient gel electrophoresis of B. neritina and B. pacifica adult with universal 16S rRNA primers.

[0047] FIG. 20 depicts HPLC profiles of bryostatin-containing extracts from B. neritina and B. pacifica.

[0048] FIG. 21 depicts phorbol dibutyrate displacement assays of ethanol extracts of bryozoans including B. pacifica showing binding to PKC.

[0049] FIG. 22 depicts exemplary nucleic acid and amino acid sequences.

DETAILED DESCRIPTION OF THE INVENTION

[0050] Definitions

[0051] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Generally, the nomenclature used herein and the laboratory procedures in cell culture, chemistry, microbiology, molecular biology and cell science described below are well known and commonly employed in the art. Conventional methods are used for these procedures, such as those provided in the art and various general references (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989)). Where a term is provided in the singular, the inventors also contemplate the plural of that term. As employed throughout the disclosure, the following terms, unless otherwise indicated, shall be understood to have the following meanings:

[0052] “Membrane permeant derivative” refers to a chemical derivative of a compound that increases membrane permeability of the compound. These derivatives are made better able to cross cell membranes because hydrophilic groups are masked to provide more hydrophobic derivatives. Also, the masking groups can be designed to be cleaved from the compound within a cell to make the compound more hydrophilic once within the cell. Because the substrate is more hydrophilic than the membrane permeant derivative, it preferentially localizes within the cell (U.S. Pat. No. 5,741,657 to Tsien et al., issued Apr. 21, 1998).

[0053] “Isolated polynucleotide” refers to a polynucleotide of genomic, cDNA, PCR or synthetic origin, or some combination thereof, which by virtue of its origin, the isolated polynucleotide (1) is not associated with the cell in which the isolated polynucleotide is found in nature, or (2) is operably linked to a polynucleotide that it is not linked to in nature. The isolated polynucleotide can optionally be linked to promoters, enhancers, or other regulatory sequences.

[0054] “Isolated protein” refers to a protein of cDNA, recombinant RNA, or synthetic origin, or some combination thereof, which by virtue of its origin the isolated protein (1) is not associated with proteins normally found within nature, or (2) is isolated from the cell in which it normally occurs, or (3) is isolated free of other proteins from the same cellular source (for example, free of cellular proteins), or (4) is expressed by a cell from a different species, or (5) does not occur in nature.

[0055] “Polypeptide” is used herein as a generic term to refer to native protein, fragments, or analogs of a polypeptide sequence.

[0056] “Active fragment” refers to a fragment of a parent molecule, such as an organic molecule, nucleic acid molecule, or protein or polypeptide, or combinations thereof, that retains at least one activity of the parent molecule.

[0057] “Naturally occurring” refers to the fact that an object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism, including viruses, that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally occurring.

[0058] “Operably linked” refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. A control sequence operably linked to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.

[0059] “Control sequences” refer to polynucleotide sequences that effect the expression of coding and non-coding sequences to which they are ligated. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding site, and transcription termination sequences; in eukaryotes, generally, such control sequences include promoters and transcription termination sequences. The term control sequences is intended to include components whose presence can influence expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.

[0060] “Polynucleotide” refers to a polymeric form of nucleotides of at least ten bases in length, either ribonucleotides or deoxyribonucleotides or a modified form of either type of nucleotide. The term includes single and double stranded forms of DNA or RNA.

[0061] “Genomic polynucleotide” refers to a portion of the genome.

[0062] “Active genomic polynucleotide” or “active portion of a genome” refer to regions of a genome that can be up regulated, down regulated or both, either directly or indirectly, by a biological process.

[0063] “Directly” in the context of a biological process or processes, refers to direct causation of a process that does not require intermediate steps, usually caused by one molecule contacting or binding to another molecule (the same type or different type of molecule). For example, molecule A contacts molecule B, which causes molecule B to exert effect X that is part of a biological process.

[0064] “Indirectly” in the context of a biological process or processes, refers to indirect causation that requires intermediate steps, usually caused by two or more direct steps. For example, molecule A contacts molecule B to exert effect X which in turn causes effect Y.

[0065] “Sequence identity” refers to the proportion of base matches between two nucleic acid sequences or the proportion of amino acid matches between two amino acid sequences. When sequence identity is expressed as a percentage, for example 50%, the percentage denotes the proportion of matches over the length of sequence from a desired sequence that is compared to some other sequence. Gaps (in either of the two sequences) are permitted to maximize matching; gap lengths of 15 bases or less are usually used, 6 bases or less are preferred with 2 bases or less more preferred. When using oligonucleotides as probes, the sequence identity between the target nucleic acid and the oligonucleotide sequence is preferably not less than 10 target base matches out of 20 (50% identity), more preferably not less than about 60% identity, 70% identity, 80% identity or 90% identity, and most preferably not less than 95% identity.
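By way of illustration only, the probe criterion above (base matches out of 20, with the tiers of 50% through 95%) can be sketched in Python as follows. The function names and the example sequences are hypothetical and are not part of the disclosure; the sketch assumes an ungapped comparison of a probe against a target segment of equal length.

```python
# Illustrative sketch only: names and sequences are hypothetical.
# Identity tiers (percent) taken from the probe criterion in the text.
IDENTITY_TIERS = (50.0, 60.0, 70.0, 80.0, 90.0, 95.0)


def probe_identity(probe: str, target_segment: str) -> float:
    """Percent of positions at which probe and target carry the same base."""
    if len(probe) != len(target_segment):
        raise ValueError("probe and target segment must have the same length")
    if not probe:
        raise ValueError("sequences must not be empty")
    matches = sum(p == t for p, t in zip(probe.upper(), target_segment.upper()))
    return 100.0 * matches / len(probe)


def highest_tier_met(identity: float):
    """Return the highest listed tier that the identity reaches, or None."""
    met = [tier for tier in IDENTITY_TIERS if identity >= tier]
    return max(met) if met else None


if __name__ == "__main__":
    probe = "ATGCATGCATGCATGCATGC"           # 20-mer, hypothetical
    target = "ATGCATGCATCGTACGTACG"          # first 10 bases match, last 10 differ
    identity = probe_identity(probe, target)
    print(identity, highest_tier_met(identity))  # 50.0 50.0
```

In this example, 10 of 20 bases match, which corresponds to the minimum preferred identity of 50%.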

[0066] “Selectively hybridize” refers to detectably and specifically bind. Polynucleotides, oligonucleotides and fragments thereof selectively hybridize to target nucleic acid strands, under hybridization and wash conditions that minimize appreciable amounts of detectable binding to nonspecific nucleic acids. High stringency conditions can be used to achieve selective hybridization conditions as known in the art. Generally, the nucleic acid sequence identity between the polynucleotides, oligonucleotides, and fragments thereof and a nucleic acid sequence of interest will be at least 30%, and more typically and preferably of at least 40%, 50%, 60%, 70%, 80% or 90%.

[0067] Hybridization and washing are typically performed at high stringency according to conventional hybridization procedures. Positive clones are isolated and sequenced. For example, a full length polynucleotide sequence can be labeled and used as a hybridization probe to isolate genomic clones from an appropriate target library as they are known in the art. Typical hybridization conditions and methods for screening plaque lifts and other purposes are known in the art (Benton and Davis, Science 196:180 (1977); Sambrook et al., supra, (1989)).

[0068] In particular, moderate and stringent hybridization conditions are well known to the art, see, for example, sections 9.47-9.51 of Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989)). For example, stringent conditions are those that (1) employ low ionic strength and high temperature for washing, for example, 0.015 M NaCl/0.0015 M sodium citrate (SSC); 0.1% sodium lauryl sulfate (SDS) at 50° C., or (2) employ a denaturing agent such as formamide during hybridization, e.g., 50% formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42° C. Another example is use of 50% formamide, 5×SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5×Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% sodium dodecylsulfate (SDS), and 10% dextran sulfate at 42° C., with washes at 42° C. in 0.2×SSC and 0.1% SDS.
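The salt concentrations recited above are all multiples of one SSC stock. As an illustrative sketch, the following Python fragment derives the 1×SSC composition from the 5×SSC figures given in the text (0.75 M NaCl, 0.075 M sodium citrate) and scales it to other strengths. The function and constant names are hypothetical.

```python
# Illustrative sketch only: names are hypothetical.
# 1x SSC composition derived from the 5x SSC values stated in the text.
NACL_M_AT_1X = 0.75 / 5      # 0.15 M NaCl
CITRATE_M_AT_1X = 0.075 / 5  # 0.015 M sodium citrate


def ssc_molarity(fold: float):
    """Return (NaCl molarity, sodium citrate molarity) for a given SSC strength."""
    if fold <= 0:
        raise ValueError("SSC strength must be positive")
    # Round to suppress floating-point noise in the printed values.
    return (round(fold * NACL_M_AT_1X, 6), round(fold * CITRATE_M_AT_1X, 6))


if __name__ == "__main__":
    for fold in (5, 0.2, 0.1):
        nacl, citrate = ssc_molarity(fold)
        print(f"{fold}x SSC: {nacl} M NaCl, {citrate} M sodium citrate")
```

Under this scaling, the low ionic strength wash of 0.015 M NaCl/0.0015 M sodium citrate corresponds to 0.1×SSC, and the 0.2×SSC wash corresponds to 0.03 M NaCl/0.003 M sodium citrate.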

[0069] Two amino acid sequences share identity if there is a partial or complete identity between their sequences. For example, 85% identity means that 85% of the amino acids are identical when the two sequences are aligned for maximum matching. Gaps (in either of the two sequences being matched) are allowed in maximizing matching; gap lengths of 5 or less are preferred with 2 or less being more preferred. Alternatively and preferably, two protein sequences (or polypeptide sequences derived from them of at least 30 amino acids in length) share identity, as this term is used herein, if they have an alignment score of at least 5 (in standard deviation units) using the program ALIGN with the mutation data matrix and a gap penalty of 6 or greater (Dayhoff, in Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, volume 5, pp. 101-110 (1972) and Supplement 2, pp. 1-10).

[0070] “Corresponds to” refers to a polynucleotide sequence that shares identity (for example is identical) to all or a portion of a reference polynucleotide sequence, or that a polypeptide sequence is identical to all or a portion of a reference polypeptide sequence. In contradistinction, the term “complementary to” is used herein to mean that the complementary sequence is homologous to all or a portion of a reference polynucleotide sequence. For illustration, the nucleotide sequence TATAC corresponds to a reference sequence TATAC and is complementary to a reference sequence GTATA.
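The TATAC/GTATA illustration above can be checked with a short Python sketch. The helper names are hypothetical, and the sketch assumes that "complementary to" is tested by comparing the reverse complement of the sequence against the reference, both written 5' to 3'.

```python
# Illustrative sketch only: helper names are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def corresponds_to(seq: str, reference: str) -> bool:
    """True if seq is identical to all or a portion of the reference."""
    return seq.upper() in reference.upper()


def complementary_to(seq: str, reference: str) -> bool:
    """True if the reverse complement of seq matches all or a portion of the reference."""
    return reverse_complement(seq).upper() in reference.upper()


if __name__ == "__main__":
    print(reverse_complement("TATAC"))         # GTATA
    print(corresponds_to("TATAC", "TATAC"))    # True
    print(complementary_to("TATAC", "GTATA"))  # True
```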

[0071] The following terms are used to describe the sequence relationships between two or more polynucleotides: “reference sequence,” “comparison window,” “sequence identity,” “percentage of sequence identity,” and “substantial identity.” A reference sequence is a defined sequence used as a basis for a sequence comparison; a reference sequence can be a subset of a larger sequence, for example, as a segment of a full length cDNA or gene sequence given in a sequence listing, or may comprise a complete cDNA or gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length. Since two polynucleotides can each (1) comprise a sequence (for example a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) may further comprise a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a “comparison window” to identify and compare local regions of sequence similarity. A comparison window, as used herein, refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window can comprise additions and deletions (for example, gaps) of 20 percent or less as compared to the reference sequence (which would not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window can be conducted by the local identity algorithm (Smith and Waterman, Adv. Appl. Math., 2:482 (1981)), by the identity alignment algorithm (Needleman and Wunsch, J. Mol. Biol., 48:443 (1970)), by the search for similarity method (Pearson and Lipman, Proc. Natl. Acad. Sci. U.S.A. 85:2444 (1988)), by the computerized implementations of these algorithms such as GAP, BESTFIT, FASTA and TFASTA (Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, Madison, Wis.), or by inspection. Preferably, the best alignment (for example, the result having the highest percentage of identity over the comparison window) generated by the various methods is selected.
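For orientation, a global alignment of the kind attributed above to Needleman and Wunsch can be sketched in Python as follows. This is a textbook-style dynamic programming sketch with a linear gap penalty, not the GAP or BESTFIT implementation; the scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration.

```python
# Illustrative sketch only: a textbook global alignment with a linear gap
# penalty. Scoring values are assumptions, not those of GAP or BESTFIT.
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Return (aligned_a, aligned_b, score) for one optimal global alignment."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])   # gap in b
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")        # gap in a
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
    print(aligned_a)
    print(aligned_b)
    print(total)
```

The aligned strings returned by such a routine have equal length and may contain gap characters, so they can be passed directly to a percentage-of-identity calculation over the comparison window.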

[0072] “Complete sequence identity” means that two polynucleotide sequences are identical (for example, on a nucleotide-by-nucleotide basis) over the window of comparison.

[0073] “Percentage of sequence identity” is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (for example, the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
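The calculation described above reduces to matched positions divided by window size, multiplied by 100. A minimal Python sketch follows; it assumes the two sequences have already been optimally aligned to equal length, with "-" marking gaps, and the function name and example sequences are hypothetical.

```python
# Illustrative sketch only: assumes pre-aligned input of equal length,
# with "-" as the gap character. Names and sequences are hypothetical.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of window positions at which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    window_size = len(aligned_a)
    if window_size == 0:
        raise ValueError("comparison window must not be empty")
    # A position counts as matched only if both bases are identical and not gaps.
    matched = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper()) if x == y and x != gap
    )
    return 100.0 * matched / window_size


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGTACGT"   # 20-position window
    query = "ACGTACGAACGTAC-TACGT"       # one mismatch and one gap
    print(percent_identity(reference, query))  # 90.0
```

Here 18 of the 20 window positions match, giving 90 percent sequence identity.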

[0074] “Substantial identity” as used herein denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 30 percent sequence identity, preferably at least 50 to 60 percent sequence identity, more usually at least 60 percent sequence identity as compared to a reference sequence over a comparison window of at least 20 nucleotide positions, frequently over a window of at least 25 to 50 nucleotides, wherein the percentage of sequence identity is calculated by comparing the reference sequence to the polynucleotide sequence that may include deletions or additions which total 20 percent or less of the reference sequence over the window of comparison.

[0075] “Substantial identity” as applied to polypeptides herein means that two peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap weights, share at least 30 percent sequence identity, preferably at least 40 percent sequence identity, more preferably at least 50 percent sequence identity, and most preferably at least 60 percent sequence identity. Preferably, residue positions that are not identical differ by conservative amino acid substitutions.

[0076] “Conservative amino acid substitutions” refer to the interchangeability of residues having similar side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine and tryptophan; a group of amino acids having basic side chains is lysine, arginine and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. Preferred conservative amino acid substitution groups are: valine-leucine-isoleucine; phenylalanine-tyrosine; lysine-arginine; alanine-valine; glutamic-aspartic; and asparagine-glutamine.
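As an illustration, the preferred substitution groups listed above can be encoded with one-letter amino acid codes and used to test whether a given replacement falls within a preferred group. The data structure and function name in this Python sketch are hypothetical.

```python
# Illustrative sketch only: names are hypothetical.
# Preferred conservative substitution groups from the text, one-letter codes.
PREFERRED_GROUPS = (
    frozenset("VLI"),  # valine-leucine-isoleucine
    frozenset("FY"),   # phenylalanine-tyrosine
    frozenset("KR"),   # lysine-arginine
    frozenset("AV"),   # alanine-valine
    frozenset("ED"),   # glutamic-aspartic
    frozenset("NQ"),   # asparagine-glutamine
)


def is_preferred_conservative(original: str, replacement: str) -> bool:
    """True if the two distinct residues share at least one preferred group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in PREFERRED_GROUPS)


if __name__ == "__main__":
    print(is_preferred_conservative("V", "L"))  # True
    print(is_preferred_conservative("A", "L"))  # False
    print(is_preferred_conservative("K", "E"))  # False
```

Note that membership is evaluated per group: alanine-valine and valine-leucine are each preferred, but alanine-leucine is not, because the two residues share no listed group.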

[0077] “Modulation” refers to the capacity to either enhance or inhibit a functional property of a biological activity or process, for example, enzyme activity or receptor binding. Such enhancement or inhibition may be contingent on the occurrence of a specific event, such as activation of a signal transduction pathway and/or may be manifest only in particular cell types.

[0078] “Modulator” refers to a chemical (naturally occurring or non-naturally occurring), such as a biological macromolecule (for example, nucleic acid, protein, non-peptide or organic molecule) or an extract made from biological materials, such as prokaryotes, bacteria, eukaryotes, plants, fungi, multicellular organisms or animals, invertebrates, vertebrates, mammals and humans, including, where appropriate, extracts of: whole organisms or portions of organisms, cells, organs, tissues, fluids, whole cultures or portions of cultures, or environmental samples or portions thereof. Modulators are typically evaluated for potential activity as inhibitors or activators (directly or indirectly) of a biological process or processes (for example, agonist, partial antagonist, partial agonist, antagonist, antineoplastic, cytotoxic, inhibitors of neoplastic transformation or cell proliferation, cell proliferation promoting agents, antiviral agents, antimicrobial agents, antibacterial agents, antibiotics, and the like) by inclusion in assays described herein. The activity of a modulator may be known, unknown or partially known.

[0079] “Test chemical” refers to a chemical or extract to be tested by at least one method of the present invention to be a putative modulator. A test chemical is usually not known to bind to the target of interest. “Control test chemical” refers to a chemical known to bind to the target (for example, a known agonist, antagonist, partial agonist or inverse agonist). Test chemical does not typically include a chemical added to a mixture as a control condition that alters the function of the target to determine signal specificity in an assay. Such control chemicals or conditions include chemicals that (1) non-specifically or substantially disrupt protein structure (for example denaturing agents such as urea or guanidinium, sulfhydryl reagents such as dithiothreitol and beta-mercaptoethanol), (2) generally inhibit cell metabolism (for example mitochondrial uncouplers) and (3) non-specifically disrupt electrostatic or hydrophobic interactions of a protein (for example, high salt concentrations or detergents at concentrations sufficient to non-specifically disrupt hydrophobic or electrostatic interactions). The term test chemical also does not typically include chemicals known to be unsuitable for a therapeutic use for a particular indication due to toxicity to the subject. Usually, various predetermined concentrations of test chemicals are used for determining their activity. If the molecular weight of a test chemical is known, the following ranges of concentrations can be used: between about 0.001 micromolar and about 10 millimolar, preferably between about 0.01 micromolar and about 1 millimolar, more preferably between about 0.1 micromolar and about 100 micromolar. When extracts are used as test chemicals, the concentration of test chemical used can be expressed on a weight to volume basis. Under these circumstances, the following ranges of concentrations can be used: between about 0.001 micrograms/ml and about 100 milligrams/ml, preferably between about 0.01 micrograms/ml and about 10 milligrams/ml, and more preferably between about 0.1 micrograms/ml and about 1 milligram/ml or between about 1 microgram/ml and about 100 micrograms/ml.
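To relate the molar ranges above to weight-to-volume terms, a concentration in micromolar can be multiplied by the molecular weight in g/mol and divided by 1000 to give micrograms/ml. The Python sketch below performs this conversion and reports the narrowest of the recited molar ranges that contains a given concentration. The names are hypothetical, and the molecular weight of 900 g/mol used in the example is an arbitrary illustrative value, not a property of any compound of the disclosure.

```python
# Illustrative sketch only: names are hypothetical; the 900 g/mol molecular
# weight in the example is an arbitrary value chosen for illustration.

# Molar ranges from the text, in micromolar, narrowest first.
MOLAR_RANGES_UM = (
    ("more preferred", 0.1, 100.0),     # 0.1 micromolar to 100 micromolar
    ("preferred", 0.01, 1_000.0),       # 0.01 micromolar to 1 millimolar
    ("broadest", 0.001, 10_000.0),      # 0.001 micromolar to 10 millimolar
)


def micromolar_to_ug_per_ml(conc_um: float, mol_weight: float) -> float:
    """Convert micromolar to micrograms/ml given molecular weight in g/mol."""
    # 1 micromolar = 1e-6 mol/L; times g/mol gives 1e-6 g/L = 1e-3 micrograms/ml.
    return conc_um * mol_weight / 1000.0


def narrowest_range(conc_um: float):
    """Return the label of the narrowest recited range containing conc_um, or None."""
    for label, low, high in MOLAR_RANGES_UM:
        if low <= conc_um <= high:
            return label
    return None


if __name__ == "__main__":
    print(micromolar_to_ug_per_ml(1.0, 900.0))
    print(narrowest_range(1.0))
    print(narrowest_range(5_000.0))
```

For a hypothetical compound of 900 g/mol, a 1 micromolar solution is about 0.9 micrograms/ml, which lies within the most preferred molar range.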

[0080] “Target” refers to a biochemical entity involved in a biological process. Targets are typically proteins that play a useful role in the physiology or biology of an organism. A therapeutic chemical typically binds to a target to alter or modulate its function. As used herein, targets can include, but not be limited to, cell surface receptors, G-proteins, G-protein coupled receptors, kinases, phosphatases, ion channels, lipases, phospholipases, nuclear receptors, intracellular structures, tubules, tubulin, and the like.

[0081] “Label” or “labeled” refers to incorporation of a detectable marker, for example by incorporation of a radiolabeled compound or attachment to a polypeptide of moieties such as biotin that can be detected by the binding of a second moiety, such as marked avidin. Various methods of labeling polypeptides, nucleic acids, carbohydrates, and other biological or organic molecules are known in the art. Such labels can have a variety of readouts, such as radioactivity, fluorescence, color, chemiluminescence or other readouts known in the art or later developed. The readouts can be based on enzymatic activity, such as beta-galactosidase, beta-lactamase, horseradish peroxidase, alkaline phosphatase or luciferase; radioisotopes, such as ³H, ¹⁴C, ³⁵S, ¹²⁵I or ¹³¹I; fluorescent proteins, such as green fluorescent proteins; or other fluorescent labels, such as FITC, rhodamine, and lanthanides. Where appropriate, these labels can be the product of the expression of reporter genes, as that term is understood in the art. Examples of reporter genes are beta-lactamase (U.S. Pat. No. 5,741,657 to Tsien et al., issued Apr. 21, 1998) and green fluorescent protein (U.S. Pat. No. 5,777,079 to Tsien et al., issued Jul. 7, 1998; U.S. Pat. No. 5,804,387 to Cormack et al., issued Sep. 8, 1998).

[0082] “Substantially pure” refers to an object species or activity that is the predominant species or activity present (for example on a molar basis it is more abundant than any other individual species or activities in the composition) and preferably a substantially purified fraction is a composition wherein the object species or activity comprises at least about 50 percent (on a molar, weight or activity basis) of all macromolecules or activities present. Generally, a substantially pure composition will comprise more than about 80 percent of all macromolecular species or activities present in a composition, more preferably more than about 85%, 90%, 95% or 99%. Most preferably, the object species or activity is purified to essential homogeneity (wherein contaminant species or activities cannot be detected by conventional detection methods), wherein the composition consists essentially of a single macromolecular species or activity. The inventors recognize that an activity may be caused, directly or indirectly, by a single species or a plurality of species within a composition, particularly with extracts.

[0083] A “PKS activity” is at least one activity of at least one PKS, such as, for example, an aromatic PKS system, a modular PKS system or a fungal PKS system.

[0084] An “aromatic PKS system” refers to a PKS system characterized by the iterative use of the catalytic sites on the several enzymes produced. Thus, in aromatic PKS systems, only one enzyme with a specific type of activity is produced to catalyze the relevant activity for the system throughout the synthesis of the polyketide. In aromatic PKS systems, the enzymes of the minimal PKS are encoded in separate open reading frames (ORFs). The actinorhodin PKS is encoded in six separate ORFs. For the minimal PKS, one ORF contains a ketosynthase (KS) and an acyltransferase (AT); a second ORF contains what is believed to be a chain-length factor (CLF); and a third reading frame encodes an acyl carrier protein (ACP). Additional ORFs encode an aromatase (ARO), a cyclase (CYC), and a ketoreductase (KR). The combination of a KS/AT, ACP, and CLF constitutes a minimal PKS, since these elements provide for a single condensation of a two-carbon unit. Furthermore, the gris PKS contains five separate ORFs wherein the KS/AT, CLF, and ACP are on three ORFs, the KR is on a fourth, and the ARO is on a fifth (WO 98/27203 to Barr et al., published Jun. 25, 1998).

[0085] A “modular PKS system” refers to a PKS system where each catalytic site is used only once and the entire PKS is encoded as a series of modules. Thus, the modular synthase protein contains a multiplicity of catalytic sites having the same type of catalytic activity. A minimal module contains at least a KS, an AT and an ACP. Optional additional activities include KR, dehydratase (DH), enoylreductase (ER) and thioesterase (TE) activities.

[0086] A “fungal PKS” encoding a 6-methyl salicylic acid synthase (6-MSAS) has some similarity to both the aromatic and modular PKS. It has only one reading frame for KS, AT, a dehydratase (DH), KR and ACP. Thus, it appears similar to a single module of a modular PKS. Unlike an aromatic PKS, it does not include a CLF.

[0087] “Pharmaceutical agent or drug” refers to a chemical, composition or activity capable of inducing a desired therapeutic effect when properly administered by an appropriate dose, regimen, route of administration, time and delivery modality.

[0089] A “bioactive compound” refers to a compound that exhibits at least one bioactivity.

[0090] A “bioactivity” refers to a composition that exhibits at least one activity that modulates a biological process. Preferred bioactivities include, but are not limited to: antibacterial activity, antimicrobial activity, not being substantially susceptible to multi-drug resistance, antiviral activity, antitumor activity, anticancer cell activity, immunomodulatory activity, anti-inflammatory activity, radiation protective activity, modulating a protein kinase C (PKC) activity or other kinase activity and cytotoxic activity.

[0091] “A bryostatin” or “bryostatin” or “bryostatins” refers to a compound that includes a bryopyran ring and has at least one bioactivity.

[0092] “Made at least in part” refers to a bioactive compound or bioactivity whose bioactivity derives at least in part from an activity of an entity, such as a marine organism.

[0093] A “bioactive derivative” refers to a modification of a bioactive compound or bioactivity that retains at least one characteristic activity of the parent compound.

[0094] A “bioactive precursor” refers to a precursor of a bioactive compound or bioactivity that exhibits at least one characteristic activity of the resulting bioactive compound or bioactivity.

[0095] An “antibacterial activity” refers to an activity that reduces the growth rate or numbers of living bacteria in a sample, such as a culture of bacteria or a sample that includes at least one bacteria, including a patient. Such antibacterial activity can be directed against Gram-negative and Gram-positive bacteria and can be screened for or confirmed using methods known in the art.

[0096] An “antimicrobial activity” refers to an activity that reduces the growth rate or numbers of living microbes in a sample (including prokaryotic and/or eukaryotic microbes), such as a culture of microbes or a sample that includes at least one microbe, including a patient and can be screened for or confirmed using methods known in the art.

[0097] “Not substantially susceptible to multiple drug resistance” refers to the property whereby cells that exhibit multiple drug resistance, such as against methicillin, vancomycin, bryostatin or taxol, cannot survive or propagate at their usual rate in the presence of a bioactive compound. Such an activity can be confirmed using methods known in the art.

[0098] An “antiviral activity” refers to an activity that reduces the infectivity of virus particles in a sample, such as in a sample including at least one virus, including a patient. Such antiviral activity can be directed against, for example, DNA or RNA containing viruses, including, but not limited to herpesvirus, hepatitis virus and retrovirus. Such activity can be screened for using methods known in the art.

[0099] An “antitumor activity” refers to an activity that reduces the growth rate or number of tumor cells in a sample, such as a culture of tumor cells or a sample that includes at least one tumor cell, including a patient. Such antitumor activity can be directed against any type of tumor or tumor cell, including, but not limited to renal tumor, lung tumor, colon tumor, central nervous system tumor, melanoma, ovarian tumor and breast tumor.

[0100] An “anticancer cell activity” refers to an activity that reduces the growth rate or number of cancer cells in a sample, such as a culture of cancer cells or a sample that includes at least one cancer cell, including a patient. Such anticancer cell activity can be directed against any type of cancer cell, including, but not limited to renal cancer, leukemia, lung cancer, colon cancer, central nervous system cancer, melanoma, ovarian cancer and breast cancer.

[0101] An “immunomodulatory activity” refers to an activity that can modulate either or both of the cellular or humoral branch of the immune system of a subject. For example, the modulation, increase or decrease of the activity of the cellular immune response, humoral immune response, or both, can be measured using methods known in the art.

[0102] An “anti-inflammatory activity” refers to an activity that reduces the severity or occurrence of an inflammatory response. Such activity can be screened using methods known in the art.

[0103] A “radiation protective activity” refers to an activity that reduces the severity or occurrence of cellular damage or mutation due to exposure to radiation. Such activity can be screened using methods known in the art.

[0104] “Modulate PKC activity” refers to the ability of a compound to bind or modulate at least one activity of at least one PKC. Preferably, the modulation results from the binding of a compound to a PKC.

[0105] A “cytotoxic activity” refers to an activity that reduces the number of viable cells in a sample, including prokaryotic cells, eukaryotic cells or both. Such activity can be screened using methods known in the art.

[0106] A “patient” or “subject” refers to a whole organism in need of treatment, such as a farm animal, companion animal or human. An animal refers to any non-human animal.

[0107] Other technical terms used herein have their ordinary meaning in the art in which they are used, as exemplified by a variety of technical dictionaries, such as the McGraw-Hill Dictionary of Chemical Terms and Stedman's Medical Dictionary.

[0108] Introduction

[0109] The present invention recognizes that marine organisms comprise nucleic acid molecules that encode polypeptides that catalyze the synthesis of bioactive compounds, such as polyketides including bryopyran rings, such as bryostatins.

[0110] Polyketide synthase (PKS) genes expected to encode polypeptides necessary to synthesize bryostatins are provided, along with methods for identifying and isolating the PKS genes needed to recombinantly biosynthesize related polyketide molecules through combinatorial synthesis. The cloned genes can also be used to screen environmental samples for novel PKS genes. The cloned genes and linked genes involved in bryostatin synthesis may be transformed and expressed in a desired host organism to produce bryostatins or derivatives thereof for a variety of purposes, including anti-cancer compounds, immunomodulatory compounds, anti-microbial, and anti-fungal compounds.

[0111] Bryostatins are a unique family of cytotoxic macrolides based on the bryopyran ring system (Pettit, 1991) (FIG. 1). They occur exclusively in the marine bryozoan Bugula neritina. Bryostatin 1 is now in Phase II clinical trials for the treatment of leukemias, lymphomas, melanoma and solid tumors (Pluda et al., 1996). Bryostatin 1 also shows promise for treatment of ovarian and breast cancer and to enhance lymphocyte survival during radiation treatment (Kraft, 1993); (Lind et al., 1993); (Grant et al., 1994); (Scheid et al., 1994); (Sung et al., 1994); (Correale et al., 1995); (Fleming et al., 1995); (Baldwin et al., 1997); (Lipshy et al., 1997); (Taylor et al., 1997); (Basu, 1998); (Johnson et al., 1999). Other bryostatins may ultimately prove even more valuable. In addition, this structure offers exciting possibilities for combinatorial biosynthesis. The cloned genes can be used for combinatorial creation of novel polyketide/bryostatin analogs that may exhibit improved anti-cancer properties. Bryostatins are complex polyketides similar to bacterial secondary metabolites biosynthesized by modular Type I polyketide synthases (PKS-I). Research and development of the bryostatins is currently severely limited by inadequate availability of bryostatins. B. neritina harbors an uncultivated symbiont, the gamma proteobacterium Candidatus Endobugula sertula. Currently, B. neritina is the exclusive source of the bryostatins.

[0112] Unlike most chemotherapeutic agents that kill rapidly dividing cells, bryostatins act on signal transduction pathways by binding to the activator site of protein kinase C (PKC) (Steube and Drexler, 1993); (Caponigro et al., 1997). Eighteen bryostatins have been described (Pettit et al., 1982); (Pettit, 1991); (Pettit et al., 1996). These vary primarily in the substituents at C-7 and C-20.

[0113] The major obstacle in investigating and developing bryostatins as anti-cancer agents or for other therapeutic purposes is the difficulty of obtaining them in ample quantities. The yield of bryostatin 1 is low; in the large-scale isolation for clinical trials it was 1.4 micrograms per gram wet weight of B. neritina (Schaufelberger et al., 1991). The supply of B. neritina is unpredictable and harvesting has long-term negative effects on colonies. Research has focused on bryostatin 1, but the other bryostatins have been isolated on the basis of their antileukemic activity. With the exception of bryostatins 16 and 17, all possess the structural features believed to account for the activity of bryostatin 1 (Pettit et al., 1982); (Pettit, 1991); (Pettit et al., 1991); (Pettit et al., 1996). Other bryostatins may equal or exceed the therapeutic value of bryostatin 1. One study showed that bryostatins 5 and 8 are as effective as bryostatin 1 in treating melanoma, but with milder side effects (Kraft et al., 1996). The unusual biological activities of bryostatins suggest that novel structures based on the bryopyran ring will likely have useful properties as potential drugs. Greater availability of bryostatins is essential to permit research to unlock the potential of this remarkable family of compounds. Although aquaculture would provide a more consistent source of bryostatins, it does not improve the low yield. Large-scale chemical synthesis of bryostatins is currently considered impractical due to their structural complexity (Kageyama et al., 1990); (Wender et al., 1998).

[0114] All polyketide biosynthetic systems for bacterial macrolide compounds that have been studied to date are similar to the type I fatty acid synthase (FAS) in that they use large multifunctional polypeptides (Schupp et al., 1995), hence the term type I PKS (PKS-I).

[0115] Complex polyketides, like the bryostatins, are typically synthesized by bacteria. B. neritina has a specific bacterial symbiont “Candidatus Endobugula sertula” (Haygood and Davidson, 1997) that the data below suggest can be the true source of the bryostatins. Cloning and expressing the bryostatin genes in a heterologous system, or cultivating E. sertula (or another B. neritina-associated bacterium that is the source of bryostatin), would prevent a supply problem. In addition, the genes involved in bryostatin synthesis could be combined with other polyketide synthase genes to result in the creation of novel drugs and drug analogs with improved activities.

[0116] The majority of pharmaceuticals used in the treatment of breast and other cancers are cytotoxic or cytostatic inhibitors of tumor growth. Despite the use of this type of drug, along with surgery and radiotherapy, in the treatment of the disease, the breast cancer death rate has not decreased (Dickson et al., 1996). This can be attributed to many factors including rising incidence, resistance to therapy, and metastasis of the disease. Since distant metastasis of breast cancer is only indirectly related to tumor size, a logical approach would be to discover drugs which directly interfere with the complex process of metastasis (Dickson et al., 1996). The drug discussed in this patent proposal, bryostatin 1, has shown among other anti-cancer activities, antimetastatic activity (Dickson et al., 1996); (Johnson et al., 1999).

[0117] Metastasis of breast cancer involves a multistep process of coordinated gene expression by tumor cells. The progression from primary tumor to metastasis involves a number of malignant characteristics including altered cell-cell and cell-substratum adhesion, increased motility, elaboration of proteases, altered growth control and the ability to produce angiogenic factors (Liotta and Kohn, 1990); (Johnson et al., 1999). These processes are believed to be modulated to some extent by the central signaling pathways of cells. Many of the above properties have been shown in various model systems to be regulated by protein kinase C (PKC)-mediated pathways; agents that modulate PKC have been shown to alter the rate of metastasis in some animal models (Johnson et al., 1999). Increased protein kinase C (PKC) activity in malignant breast tissue and positive correlations between PKC activity and expression of a more aggressive phenotype in breast cancer cell lines, suggest a role for this signal transduction pathway in the pathogenesis and/or progression of breast cancer. Thus, findings suggest that the PKC pathway may modulate progression of breast cancer to a more aggressive neoplastic process (Ways et al., 1995). Numerous studies suggest the use of PKC modulators, such as bryostatin 1, as anti-invasive and/or antimetastatic agents in the treatment of breast cancer (Johnson et al., 1999).

[0118] Recently, an exciting potential therapeutic use of PKC modulators has emerged. PKC modulators can interact with many chemotherapeutic agents and potentiate their activity (Caponigro et al., 1997). PKC isozymes have been implicated in the regulation of the multidrug resistance (MDR) phenotype. The MDR phenotype is expressed by some tumor cell populations, in which a drug efflux pump is activated with consequent cross-resistance to major classes of anticancer drugs in clinical use (vinca alkaloids, anthracyclines, podophyllotoxins, taxanes) due to reduced intracellular drug accumulation (Korczak et al., 1989); (Caponigro et al., 1997). The MDR phenotype is accompanied by changes in the PKC activity and many observations indicate a role for PKC in the regulation of this phenotype (Fine et al., 1988); (Caponigro et al., 1997). There is preclinical evidence of antiproliferative activity of PKC modulators (Johnson et al., 1999). In addition, encouraging results have been obtained in combined administration of PKC modulators and other cytotoxic drugs, including those involved in the MDR phenotype (Caponigro et al., 1997). In contrast to the often severe effects of other MDR reversal agents, PKC modulators appear to act through a different mechanism. In the only clinical trial of a drug belonging to this class, used in combination with doxorubicin, serum levels approximating those that potentiate the effects of chemotherapy in tumor-bearing animals were achieved without significant toxicity, while no pharmacokinetic interaction has been recorded (Jayson et al., 1995). Thus, drugs such as bryostatin 1 targeting PKC may be useful as a means of counteracting drug resistance during cancer chemotherapy.

[0119] Studies also indicate that bryostatin 1 may be an effective cancer treatment when combined with other drugs. For example, bryostatin 1 in combination with IL-2 in vitro enhances proliferation and IL-2 receptor expression on lymphocytes, favoring CD8+ cells while suppressing the generation of lymphokine-activated killer (LAK) activity (Scheid et al., 1994). In cancer patients, intravenous administration of bryostatin 1 increased the potential of IL-2 to induce proliferation of LAK activity in lymphocytes, which the authors suggest makes bryostatin an interesting candidate for clinical trials in combination with IL-2 (Scheid et al., 1994). With respect to breast cancer, a recent study suggested that bryostatin 1 sensitized human breast cancer cells to the cytotoxic effects of gemcitabine (Philip et al., 1999).

[0120] Thus, bryostatin 1 and analogs thereof derived from combinatorial biosynthesis are excellent candidates for the treatment of breast cancer. Bryostatin 1 has tremendous potential as a breast cancer treatment based on antineoplastic activity, antimetastatic activity and immunostimulation during chemotherapy. Bryostatin has even greater potential in combination therapy as an adjunct to known anticancer agents against the MDR phenotype. However, large-scale clinical studies and ultimate supply of bryostatin will be hampered by a supply problem, as was the case for taxol. The application of this patent (i.e., cloning and expressing the bryostatin biosynthesis genes) could avoid this problem. Furthermore, the cloning of the bryostatin biosynthesis genes could lead to combinatorial biosynthesis of bryostatin analogs exhibiting improved anti-cancer properties.

[0121] The structure of bryostatins and biosynthesis thereof suggest that they are synthesized by a Type I polyketide synthase (PKS-I). The cloning and expression of this polyketide synthase and associated tailoring enzymes from B. neritina would allow production of essentially unlimited amounts of bryostatins. In addition, the structure of bryostatin offers exciting possibilities for combinatorial biosynthesis of a wide variety of compounds, including novel compounds. The cloned genes can be used for combinatorial creation of novel bryostatin analogs that may exhibit improved properties.

[0122] As a non-limiting introduction to the breadth of the present invention, the present invention includes several general and useful aspects, including:

[0123] 1) a composition including at least one nucleic acid molecule that encodes at least one polypeptide that catalyzes at least one step in the synthesis of at least one polyketide such as a bryopyran ring, wherein said at least one nucleic acid is derived from at least one marine organism;

[0124] 2) a composition including a library of nucleic acid molecules of 1);

[0125] 3) a composition including at least one polypeptide that catalyzes at least one step in the synthesis of at least one polyketide such as a bryopyran ring, wherein said at least one polypeptide is derived from at least one marine organism;

[0126] 4) a composition including a library of polypeptides of 3);

[0127] 5) a method of making a composition including providing at least one composition of 1), and synthesizing at least one polyketide such as a bryopyran ring;

[0128] 6) a composition made by the method of 5);

[0129] 7) a method of making a composition including providing at least one composition of 3), and synthesizing at least one bryopyran ring;

[0130] 8) a composition made by the method of 7);

[0131] 9) a method for identifying at least one nucleic acid molecule encoding at least one activity of a PKS including contacting a nucleic acid molecule of 1) with a sample, and identifying nucleic acid molecules in said sample that hybridize with said nucleic acid molecule of 1);

[0132] 10) a nucleic acid molecule identified by the method of 9);

[0133] 11) a composition comprising a library of nucleic acid molecules of 10);

[0134] 12) a method for identifying a bioactive compound including contacting a composition of 5) and determining the bioactivity of said compound;

[0135] 13) a method for identifying a bioactive compound including contacting a composition of 8) and determining the bioactivity of said compound;

[0136] 14) a preparation of bacteria from a Bugula that include PKS genes; and

[0137] 15) a composition comprising at least one polyketide, bryopyran ring or bryostatin present in Bugula pacifica.

[0138] These aspects of the invention, as well as others described herein, can be achieved by using the methods, articles of manufacture and compositions of matter described herein. To gain a full appreciation of the scope of the present invention, it will be further recognized that various aspects of the present invention can be combined to make desirable embodiments of the invention.

[0139] I. Nucleic Acid Molecules That Encode Polypeptides That Catalyze the Synthesis of Polyketides Such as Bryopyran Rings and Libraries of Such Nucleic Acid Molecules

[0140] The present invention includes a composition including at least one nucleic acid molecule, such as a substantially purified or purified nucleic acid molecule, that encodes at least a portion of at least one polypeptide that catalyzes at least one step in the synthesis of at least one polyketide such as a bryopyran ring, such as a bryostatin. Preferably, at least one nucleic acid molecule is derived from at least one marine organism. The nucleic acid molecules of the present invention can comprise the nucleic acid molecules disclosed herein, including PCR primers, portions thereof, and nucleic acid molecules that selectively hybridize with or have substantial identity with the nucleic acid molecules disclosed herein or portions thereof, or encode at least one conservative amino acid substitution relative to the disclosed sequences or portions thereof. A nucleic acid molecule of the present invention can be DNA, RNA, single stranded, double stranded or any combination thereof.

[0141] A nucleic acid molecule of the present invention preferably encodes at least a portion of a polypeptide involved in the synthesis of at least one polyketide. Preferably, the polypeptide is at least a portion of a polyketide synthase, including PKS type I or PKS type II enzymes. Preferably, the polyketide synthase is a PKS type I enzyme, which can include a plurality of active domains that are involved in the synthesis of a polyketide, such as a bryopyran ring. A nucleic acid molecule of the present invention preferably encodes at least a portion of at least one such active domain and can include at least one activity of such an active domain, preferably an activity that catalyzes at least one step in the synthesis of a polyketide. Preferably, a nucleic acid molecule of the present invention is between about 1 Kb and about 100 Kb, between about 5 Kb and about 50 Kb or between about 10 Kb and about 25 Kb in length.

[0142] A nucleic acid molecule of the present invention can be derived from at least one marine organism. A marine organism can include any organism that can be found in a marine environment, either naturally or xenotypically. A marine organism can be a vertebrate, an invertebrate or a unicellular organism, such as a fungus, alga or bacterium. Preferably, a marine organism is an invertebrate, such as a Bugula, such as Bugula neritina or Bugula pacifica, or a unicellular organism, such as a bacterium, such as an Endobugula, such as an Endobugula sertula.

[0143] A nucleic acid molecule of the present invention from a marine organism can be characterized as having an unusually low G:C content, for example between about 35% and about 55%, which can vary depending on the particular marine organism. Certain nucleic acid molecules of the present invention exemplified in SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27 and SEQ ID NO:29 through SEQ ID NO:37 or portions thereof or nucleic acid molecules including at least a portion thereof have a G:C content ranging from about 35% to about 55%. This low G:C content is particularly noted in symbionts of Bugula neritina and Bugula pacifica.
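
By way of non-limiting illustration, the G:C content of a nucleotide sequence, and whether it falls within the range of about 35% to about 55% noted above, can be computed with a short Python sketch such as the following. The function names and the example sequences are hypothetical and are assumed only for illustration; the example sequences are not sequences of the present disclosure.

```python
# Illustrative sketch only: function names and example sequences are assumptions.


def gc_content(seq: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T, U).

    Ambiguity codes such as N, gaps and whitespace are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def in_gc_range(seq: str, low: float = 35.0, high: float = 55.0) -> bool:
    """Return True if the G:C content of seq lies between low and high percent."""
    return low <= gc_content(seq) <= high


if __name__ == "__main__":
    example = "ATGCATGCAT"  # hypothetical sequence, 4 of 10 bases are G or C
    print(gc_content(example))
    print(in_gc_range(example))
```

Because the recited range is stated as approximate, the bounds are exposed as parameters rather than fixed in the function body.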

[0144] A nucleic acid of the present invention can also encode a fusion protein that includes a polypeptide of the present invention and a polypeptide of interest. A polypeptide of interest can be any polypeptide, but is preferably a detectable label, such as green fluorescent protein, or a sequence that aids in the purification of a polypeptide, such as FLAG. A nucleic acid that encodes a fusion protein can be made by operably linking a nucleic acid that encodes a polypeptide of interest with a nucleic acid that encodes a polypeptide of the present invention. The operable linkage can be direct or indirect, such as in the case where a linker connects the polypeptide of the present invention with a polypeptide of interest. The nucleic acid molecule of the present invention and the nucleic acid that encodes a polypeptide of interest are preferably operably linked in frame such that an operable polypeptide of the present invention and an operable polypeptide of interest are translated from the nucleic acid, but that need not be the case.

[0145] Nucleic acid molecules of the present invention can be made using methods known in the art and described herein (see, Sambrook et al., supra (1989)). For example, nucleic acid molecules of the present invention can be identified and isolated using PCR methodologies, including RT-PCR, and sequenced using established methods such that their homologies can be determined. The ability of one nucleic acid molecule to hybridize with another can be determined through experimentation under a variety of stringencies, or can be estimated based on their length and G:C contents. Alterations of identified sequences can be made using routine methods, such as mutagenesis, RT-PCR or other PCR methods (See, Sambrook et al, supra, (1989)).
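
By way of non-limiting illustration, an estimate of duplex stability from length and G:C content can be sketched in Python as follows. The formula used is one commonly cited empirical approximation for the melting temperature of long DNA duplexes; the formula, its constants, the function name and the example values are not taken from the present disclosure and are assumed here only for illustration. Other published variants use different constants, and any such estimate would ordinarily be confirmed by experimentation.

```python
# Illustrative sketch only. The empirical approximation below is an assumption:
#   Tm = 81.5 + 16.6 * log10([Na+]) + 0.41 * (%GC) - 675 / N
# where [Na+] is molar and N is the duplex length in nucleotides.
import math


def estimate_tm_celsius(length_nt: int, gc_percent: float, na_molar: float = 1.0) -> float:
    """Estimate the melting temperature (degrees C) of a long DNA duplex."""
    if length_nt <= 0:
        raise ValueError("length must be positive")
    if not 0.0 <= gc_percent <= 100.0:
        raise ValueError("G:C content must be between 0 and 100 percent")
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 675.0 / length_nt


if __name__ == "__main__":
    # Hypothetical 1000-nucleotide duplexes at the two ends of the G:C range noted herein.
    for gc in (35.0, 55.0):
        print(gc, round(estimate_tm_celsius(1000, gc), 2))
```

Under this approximation, a lower G:C content lowers the estimated melting temperature, which is one reason hybridization stringency may need to be adjusted for sequences of low G:C content.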

[0146] A nucleic acid molecule of the present invention can include at least one expression control sequence. Preferably, an expression control sequence is operably linked to a nucleic acid molecule such that the nucleic acid molecule can be expressed in an in vivo or in vitro transcription and/or translation system. The choice of expression control sequences is dependent upon the transcription system to be used. For example, if a prokaryotic organism such as E. coli is to be used to express a nucleic acid molecule, then at least one appropriate prokaryotic expression control sequence would be used. Likewise, if a eukaryotic organism is to be used to express a nucleic acid molecule, then at least one appropriate eukaryotic expression control sequence, such as CMV or LTRs would be used. Such nucleic acid molecules can be in any form, such as in a plasmid or in a linear form.

[0147] A nucleic acid molecule of the present invention can be provided with or without expression control sequences in a vector, such as a plasmid or a viral vector. Viral vectors can be chosen so that they are appropriate for a cell to be transfected, such as, for example, a phage, cosmid, retrovirus, vaccinia, adenovirus or adenoassociated virus. Viral vectors can introduce a nucleic acid molecule into a cell during its normal biological processes. Non-viral vectors can be used to introduce a nucleic acid molecule of the present invention into a host cell using methods known in the art, such as lipofection, cold calcium chloride or electroporation. The nucleic acid molecule in a cell can be extrachromosomal or be integrated into the genome of the cell. The host cell can be any appropriate host cell, such as a eukaryotic or prokaryotic cell. Preferably, a nucleic acid molecule of the present invention is expressed in the cell, but that is not a requirement of the invention. Preferably, the cell does not normally include a nucleic acid molecule of the present invention or express a polypeptide of the present invention, but that need not be the case. For example, a cell that expresses a relatively low amount of a polypeptide of the present invention can be made to express relatively higher amounts of a polypeptide once transfected with a nucleic acid of the present invention.

[0148] Cells that express a polypeptide of the present invention can be screened for and selected using a variety of methods, including those set forth in the present invention. For example, immunoassays, such as western blots, can be used to identify cell lysates that include a polypeptide of the present invention. In addition, immunocytochemistry can be used to identify and localize a polypeptide of the present invention on or within a cell. Furthermore, in situ hybridization methods, such as FISH, can be used to identify and localize nucleic acid molecules within a cell and hybridization methods can be used to identify nucleic acid molecules, either DNA or RNA, in cellular preparations. Cells or cell lysates can be screened for an activity using a variety of methods. For example, the ability of a cell or cell lysate to bind with a substrate or convert a substrate, including a detectably labeled substrate, can be used to detect a particular activity (see Haygood and Davidson, 1997).

[0149] As set forth in the Examples and exemplified in SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27 and SEQ ID NO:29 through SEQ ID NO:37 or portions thereof or nucleic acid molecules including at least a portion thereof, nucleic acid molecules of the present invention can encode polypeptides that have PKS activity. Nucleic acid molecules encoding PKS activity, or other activities associated with PKS, can be identified by making comparisons of nucleic acid sequences or translated amino acid sequences derived therefrom using methods known in the art, including BLAST comparisons. A nucleic acid molecule of the present invention can be expressed and the expression products screened and confirmed for having PKS activity (see, McDaniel et al., 1999; Shen et al., 1999). In addition, nucleic acid molecules of the present invention can encode polypeptides that have other activities of PKS. Methods for screening such activities are known in the art (see, McDaniel et al., 1999).
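By way of illustration only, the comparison of translated amino acid sequences can be sketched as follows. This is a minimal sketch and not the BLAST algorithm: it translates a nucleotide sequence in a chosen reading frame using the standard genetic code and computes an ungapped percent identity between two peptides of equal length. The function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate a DNA string in the given forward frame (0, 1 or 2).

    Unrecognized codons (for example, those containing N) become 'X';
    stop codons become '*'.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def percent_identity(pep_a, pep_b):
    """Ungapped percent identity of two equal-length peptides.

    A simplified stand-in for an alignment score; no gaps or
    substitution matrix are considered.
    """
    if len(pep_a) != len(pep_b) or not pep_a:
        raise ValueError("peptides must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(pep_a, pep_b))
    return 100.0 * matches / len(pep_a)


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(translate("ATGGCCAAAACC"))
    print(percent_identity("MAKT", "MAKS"))
```

In practice a gapped local alignment tool such as BLAST would be used; the sketch only shows the translate-then-compare idea.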

[0150] The present invention also includes a library of nucleic acids of the present invention. A library of nucleic acids includes between about two, about four, about six, about eight, about ten, about thirty, about seventy, about one-hundred, about one-thousand, about ten-thousand, about one-hundred thousand or about one-million nucleic acid molecules and about three, about five, about seven, about twenty, about fifty, about five-hundred, about fifty-thousand, about five-hundred thousand and about ten million nucleic acid molecules. The members of such a library are preferably different nucleic acid molecules, but that need not be the case.

[0151] The nucleic acid molecules of the present invention can be used for a variety of applications, including, but not limited to, as PCR primers, as probes to identify similar sequences, and to make polypeptides of the present invention. The particular application of a nucleic acid molecule depends on a variety of factors, as they are known in the art, including the length, strandedness (single stranded or double stranded and positive sense or negative sense), chemical characterization (such as DNA or RNA) and whether the nucleic acid molecule is detectably labeled or not.

[0152] II. A Polypeptide That Catalyzes the Synthesis of Polyketides Such as Bryopyran Rings and Libraries Thereof

[0153] The present invention also includes a composition including at least one polypeptide or a portion thereof that catalyzes at least one step in the synthesis of at least one polyketide, such as a bryopyran ring, wherein the at least one polypeptide or a portion thereof is derived from at least one marine organism.

[0154] A polypeptide of the present invention can be derived from at least one marine organism. A marine organism can include any organism that can be found in a marine environment, either naturally or xenotypically. A marine organism can be a vertebrate, an invertebrate or a unicellular organism, such as a fungus, alga or bacterium. Preferably, a marine organism is an invertebrate, such as a Bugula, such as Bugula neritina or Bugula pacifica, or a unicellular organism, such as a bacterium, such as a Candidatus, such as an Endobugula, such as an Endobugula sertula.

[0155] The nucleic acid molecules of the present invention can be translated to provide polypeptides. These polypeptides can be substantially purified or purified and preferably have at least one activity of a polyketide synthase, such as a PKS type I or a PKS type II, including a PKS that is involved in the synthesis of a bryopyran ring, including a bryostatin. The activity of the polypeptide can be screened and confirmed using methods known in the art, later developed or described herein. For example, antibodies that bind with active portions or fragments of polyketide synthases can be used to identify appropriate polypeptides. Alternatively, substrates for an activity, such as substrates that are detectably labeled, can be used to detect the binding of a substrate to an activity or the conversion of a substrate to a product. As set forth in the Examples and exemplified in SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28 and SEQ ID NO:38 or portions thereof or polypeptides or proteins including at least a portion thereof, polypeptides having PKS activity have been isolated. The PKS activity of polypeptides of the present invention can be screened and confirmed using methods known in the art (see, McDaniel et al., 1999; Shen et al., 1999).

[0156] A polypeptide of the present invention can be of any length, but is preferably between about 10 amino acids and about 300,000 amino acids in length and more preferably between about 100 amino acids and about 30,000 amino acids in length or between about 1,000 amino acids and about 3,000 amino acids in length.

[0157] The polypeptide of the present invention can be made using recognized methods, such as by way of recombinant methods as they are known in the art (see, Sambrook et al., supra, (1989)) or by digesting proteins or polypeptides. For example, nucleic acid molecules encoding or suspected of encoding a polypeptide of the present invention can be cloned into expression vectors that are transformed into appropriate host cells where the nucleic acid molecules are expressed. The resulting polypeptides can be optionally purified and their activity confirmed using methods of the present invention or as they are known in the art or later developed. Alternatively, the in vivo activity of polypeptides can be confirmed using methods of the present invention or as they are known in the art.

[0158] A polypeptide of the present invention can be provided ex vivo or within a cell. A polypeptide of the present invention can be expressed within a cell by transfecting a cell with a nucleic acid molecule that encodes a polypeptide of the present invention. The nucleic acid molecule of the present invention can be operably linked to expression control sequences appropriate for the cell such that the nucleic acid molecule of the present invention is expressed on or within the cell. The nucleic acid molecule can also encode a fusion protein such that the fusion protein is expressed on or within the cell. In this instance, a fusion protein that includes a detectable label as the polypeptide of interest can be used to track the location of the fusion protein in the cell.

[0159] A polypeptide of the present invention can also be part of a fusion protein that includes a polypeptide of the present invention and a polypeptide of interest. A polypeptide of interest can be any polypeptide, but is preferably a detectable label, such as green fluorescent protein, or a sequence that aids in the purification of a polypeptide, such as FLAG. A fusion protein that includes a polypeptide of the present invention can be made from a nucleic acid that encodes the fusion protein. Such a nucleic acid can be made by operably linking a nucleic acid that encodes a polypeptide of interest with a nucleic acid molecule of the present invention. The operable linkage can be direct or indirect, such as in the case where a linker connects the polypeptide of the present invention with a polypeptide of interest. The nucleic acid molecule of the present invention and the nucleic acid that encodes a polypeptide of interest are preferably operably linked in frame such that an operable polypeptide of the present invention and an operable polypeptide of interest are translated from the nucleic acid, but that need not be the case. The present invention also includes such fusion proteins or libraries of such fusion proteins.
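By way of illustration only, the in-frame requirement for a fusion construct can be checked computationally. The following is a minimal sketch that assumes both coding sequences are supplied as DNA strings beginning at their first codon; the function names, the linker and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_fusion(upstream_orf, downstream_orf, linker=""):
    """Join two coding sequences, optionally through a linker.

    The terminal stop codon of the upstream sequence, if present, is
    removed so that translation can continue into the downstream part.
    """
    upstream = upstream_orf.upper()
    if upstream[-3:] in STOP_CODONS:
        upstream = upstream[:-3]
    return upstream + linker.upper() + downstream_orf.upper()


def is_in_frame(fusion):
    """Return True if the fusion reads as one uninterrupted frame.

    The length must be a multiple of three and no stop codon may occur
    before the final codon.
    """
    if len(fusion) % 3 != 0:
        return False
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    return not any(codon in STOP_CODONS for codon in codons[:-1])


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    good = build_fusion("ATGGCCTAA", "ATGAAATGA", linker="GGTTCT")
    bad = build_fusion("ATGGCCTAA", "ATGAAATGA", linker="GG")
    print(good, is_in_frame(good))
    print(bad, is_in_frame(bad))
```

A linker whose length is not a multiple of three shifts the downstream reading frame, which is why the second construct fails the check.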

[0160] The present invention also includes a library of polypeptides of the present invention. A library of polypeptides of the present invention, including fusion proteins, includes between about two, about four, about six, about eight, about ten, about thirty, about seventy, about one-hundred, about one-thousand, about ten-thousand, about one-hundred thousand or about one-million polypeptides and about three, about five, about seven, about twenty, about fifty, about five-hundred, about fifty-thousand, about five-hundred thousand and about ten million polypeptides. The members of such a library are preferably different polypeptides, but that need not be the case.

[0161] The present invention also includes antibodies that specifically bind with a polypeptide of the present invention. Such antibodies can be polyclonal or monoclonal and can be made using methods known in the art (see, Harlow, Antibodies: A Laboratory Manual, Cold Spring Harbor Press, 1988). The specificity of such antibodies can be screened and confirmed using assay formats known in the art, such as enzyme linked immunosorbent assays (ELISAs) or other appropriate immunoassay formats.

[0162] III. Method of Making Polyketides Such as Bryopyran Rings Using Nucleic Acids or Polypeptides of the Present Invention and Compositions Made Thereby

[0163] The present invention also includes a method of making a composition including providing at least one nucleic acid molecule or polypeptide of the present invention, and synthesizing at least one polyketide or precursor thereof, such as a bryopyran ring, such as a bryostatin.

[0164] At least one nucleic acid molecule of the present invention or at least one polypeptide of the present invention can be expressed and used in a system to synthesize a polyketide or precursor thereof, including a bryopyran ring, such as a bryostatin. The polyketides or precursors thereof can be previously known or unknown polyketides or precursors thereof. A variety of methods of producing polyketides or precursors thereof using known PKS genes, in particular known PKS type I genes, have been reported (see, U.S. Pat. No. 5,672,491 to Khosla et al., issued Sep. 30, 1997; U.S. Pat. No. 5,712,146 to Khosla et al., issued Jan. 27, 1998; U.S. Pat. No. 5,716,849 to Ligon et al., issued Feb. 10, 1998; U.S. Pat. No. 5,744,350 to Vinci et al., issued Apr. 28, 1998; U.S. Pat. No. 5,783,431 to Peterson et al., issued Jul. 21, 1998; U.S. Pat. No. 5,824,513 to Katz et al., issued Oct. 20, 1998; U.S. Pat. No. 5,830,750 to Khosla et al., issued Nov. 3, 1998; U.S. Pat. No. 5,843,718 to Khosla et al., issued Dec. 1, 1998; U.S. Pat. No. 5,849,541 to Vinci et al., issued Dec. 15, 1998; U.S. Pat. No. 5,876,991 to DeHoff et al., issued Mar. 2, 1999; WO 93/13663 to Katz et al., published Jul. 22, 1993; WO 95/12661 to Vinci et al., published May 11, 1995; WO 97/22711 to Sherman et al., published Jun. 26, 1997; WO 98/01546 to Leadlay et al., published Jan. 15, 1998; WO 98/11230 to Bloom et al., published Mar. 19, 1998; WO 98/27203 to Barr et al., published Jun. 25, 1998; WO 98/49315 to Khosla et al., published Nov. 5, 1998; WO 98/53097 to Waters et al., published Nov. 26, 1998; WO 99/02669 to Betlach, published Jan. 21, 1999; EP 791,655 A2 to DeHoff et al., published Aug. 27, 1997; EP 791,656 A2 to Burgett et al., published Aug. 27, 1997).

[0165] The present invention utilizes at least one nucleic acid molecule of the present invention and/or at least one polypeptide of the present invention in such methods to make known or novel polyketides, including bryopyran rings and bryostatins. The present invention can utilize at least one nucleic acid molecule of the present invention and/or at least one polypeptide of the present invention alone or in combination with other PKS polypeptides or PKS genes, such as PKS type I polypeptides or PKS type I genes. These polypeptides and genes can be known or later developed and can be from any type of PKS, including PKS derived from marine, aquatic or terrestrial organisms. For example, a PKS can be an aromatic PKS system, a modular PKS system, a fungal PKS system or modified forms thereof (see, WO 98/27203 to Barr et al., published Jun. 25, 1998).

[0166] For example, some methods of synthesizing polyketides provide cassettes that include a PKS type I gene complex, either in whole or in part. Nucleic acid molecules of the present invention can be inserted into such cassettes randomly or non-randomly, including replacing identified PKS type I genes. Random integration can be accomplished using the methods of Whitney et al., (WO 98/13353, published Apr. 2, 1998) and non-random integration can be accomplished using the methods of Smith et al., (WO 94/24301, published Oct. 27, 1994). When nucleic acid molecules of the present invention are inserted non-randomly into such cassettes, they can be inserted in-frame to replace PKS genes that encode polypeptides that have functions similar to the polypeptide encoded by a nucleic acid of the present invention. The nucleic acid molecules of the present invention are thus expressed as polypeptides of the present invention, which can act as part of a PKS complex to produce known or novel polyketides, such as bryopyran rings including bryostatins.

[0167] Cells or extracts thereof (such as substantially purified extracts) that include one or more of the nucleic acid molecules of the present invention or one or more polypeptides of the present invention can be used to synthesize a wide variety of polyketides, including bryopyran rings and bryostatins. Such cells or extracts thereof can be contacted with a variety of compounds, including substrates for PKS activity, particularly PKS activity present in the cells or extracts thereof. Polypeptides expressed from nucleic acids of the present invention can act on these compounds in order to make a wide variety of polyketides such as bryopyran rings including bryostatins. In one aspect of the present invention, more than one cell and/or extract thereof can be used in combination or sequentially such that the products made by the combination of cells or extracts can be determined and their activity confirmed.

[0168] The present invention also includes compounds made or identified using the present invention. For example, the present invention includes polyketides, bryopyran rings and bryostatins made using at least one method of the present invention. A compound made or identified using a method of the present invention can be a novel or non-novel compound. For example, a compound of the present invention can optionally not include a compound that was not novel on the date of the filing of the present application, or one year or six months prior to the filing date of the present application. In this aspect of the present invention, the compound of the present invention preferably does not include a known bryostatin.

[0169] A compound of the present invention can be provided with at least one pharmaceutically acceptable carrier, as such carriers are known in the art and disclosed herein. A compound of the present invention can also be a pharmaceutical composition.

[0170] IV. Method of Identifying Nucleic Acid Molecules, Nucleic Acid Molecules Identified Thereby and Libraries of Nucleic Acid Molecules

[0171] The present invention also includes a method for identifying at least one nucleic acid molecule encoding at least one activity of a PKS including contacting a nucleic acid molecule of the present invention with a sample, and identifying nucleic acid molecules in said sample that hybridize with said nucleic acid molecule of the present invention. This aspect of the present invention utilizes nucleic acid molecules of the present invention as probes or PCR primers in order to identify nucleic acid molecules that encode, or are expected to encode, polypeptides that have PKS activity.

[0172] Samples for use in the present invention can be from any source that can include a nucleic acid molecule, but are preferably environmental samples, such as samples from the marine environment. The samples can include marine organisms, including invertebrates or vertebrates or any other marine organism. Preferably, the sample includes single celled organisms, such as bacteria. More preferably, the sample is expected to contain a polyketide, such as a bryopyran including bryostatins. Such samples include, for example, Bugula species, including Bugula neritina and Bugula pacifica.

[0173] When used as probes, a nucleic acid molecule of the present invention can be detectably labeled and contacted with a sample. Nucleic acid molecules that bind with the nucleic acid of the present invention can be identified, cloned and sequenced using methods known in the art. The identified nucleic acid molecules can be operably linked to expression control sequences such that a polypeptide encoded by the identified nucleic acid molecule can be made and characterized.

[0174] When used as PCR primers, the nucleic acid molecules of the present invention can be used to amplify nucleic acid molecules in a sample. The amplified nucleic acid molecules are presumptively derived from a PKS gene. The amplified nucleic acid molecules can be identified, cloned and sequenced using methods known in the art. The amplified nucleic acid molecules can be operably linked to expression control sequences such that a polypeptide encoded by the amplified nucleic acid molecule can be made and characterized.
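By way of illustration only, the expected product of a primer pair on a known template can be predicted in silico. The following is a minimal sketch that assumes exact primer matches on the given strand of a linear template; degenerate primers and mismatches, which are common when targeting conserved PKS regions, are not handled. The function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the predicted amplicon, or None if a primer does not match.

    The forward primer must match the template strand as given; the
    reverse primer must match the opposite strand downstream of it.
    """
    template = template.upper()
    forward = forward_primer.upper()
    reverse_site = reverse_complement(reverse_primer)
    start = template.find(forward)
    if start == -1:
        return None
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    # Hypothetical template and primers, for illustration only.
    template = "GGATGCCATTTTTGATTACAA"
    print(in_silico_pcr(template, "ATGCCA", "GTAATC"))
```

The amplicon includes both primer sequences, as it would in an actual amplification product.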

[0175] The nucleic acid molecules of the present invention can also be used to identify nucleic acid molecules that are upstream or downstream from a targeted segment of a PKS gene. A nucleic acid molecule that encodes a conserved region of a PKS gene can be used in primer extension or inverse PCR methods such that upstream or downstream segments from the point of hybridization are identified. These extended segments can be identified, cloned and sequenced using methods known in the art. The extended segments can be operably linked to expression control sequences such that a polypeptide encoded by the extended segment can be made and characterized.

[0176] Whether the nucleic acid molecule of the present invention is used as a probe, primer or PCR primer, the methods of the present invention identify nucleic acid molecules that presumptively encode at least a portion of a PKS gene. Any of these processes can be used alone, in combination or reiteratively to identify at least portions of PKS genes in a sample.

[0177] The present invention also includes nucleic acid molecules identified by the present invention. The identified nucleic acid molecules can include expression control sequences operably linked to the identified nucleic acid molecules. Such constructs can be used to make polypeptides encoded by the identified nucleic acid molecules and the polypeptides can be characterized as to a variety of structures and functions, particularly structures and functions associated with PKS genes. The present invention includes a library of nucleic acids, cells or polypeptides identified by the present invention.

[0178] V. Method of Identifying a Bioactive Compound, Bioactive Compounds, and Therapeutic Compositions

[0179] The present invention also includes a method for identifying a bioactive compound including contacting a compound made or identified by the present invention with at least one in vitro, ex vivo or in vivo assay system and determining the bioactivity of said compound. The present invention includes bioactive compounds identified using this method. The identified bioactive compounds can be provided in a pharmaceutically acceptable carrier and can be a pharmaceutical compound.

[0180] In vitro, ex vivo and in vivo systems used in the present invention are preferably those known in the art for a bioactivity to be identified. The assay chosen to be used in this method is related to a bioactivity that is being screened for. Preferred systems include those that determine at least one PKS activity, such as one activity of a bryopyran ring, including at least one activity of a bryostatin. For example, in vitro systems (systems that do not use whole organisms or whole cells) and ex vivo systems (systems that use whole cells or portions of cells) for the identification of polyketide, bryopyran ring or bryostatin activity are known in the art (see DeVries et al., 1988). In vivo systems (systems that use whole organisms or tissues or organs derived therefrom) are also known in the art (see DeVries et al., supra, 1988). Compounds that are identified using these methods as having a desired bioactivity are bioactive compounds that have at least one bioactivity.

[0181] Screening of Compounds for Activities

[0182] The following assays can be performed to confirm the bioactivity of a compound:

[0183] a) antimicrobial effect on S. aureus by placing a compound on a paper disk and determining the ability of the compound to inhibit the growth of the S. aureus (Benson, Microbial Applications, 6th Ed. Wm. C. Brown Publishers, Dubuque, Iowa (1994)). The results of this assay establish the toxicity of the compound towards Gram-positive bacteria.

[0184] b) antimicrobial effect on E. coli by placing a compound on a paper disk and determining the ability of the compound to inhibit the growth of the E. coli (Benson, Microbial Applications, 6th Ed. Wm. C. Brown Publishers, Dubuque, Iowa (1994)). The results of this assay establish the toxicity of a compound towards Gram-negative bacteria.

[0185] c) antimicrobial effect on Candida albicans by placing a compound on a paper disk and determining the ability of the compound to inhibit the growth of Candida albicans (Benson, Microbial Applications, 6th Ed. Wm. C. Brown Publishers, Dubuque, Iowa (1994)). The results of this assay establish the toxicity of compounds towards yeasts and fungi.

[0186] d) multi-drug resistance assay using S. aureus by placing a compound on a paper disk and determining the ability of the compound to inhibit the growth of S. aureus exhibiting methicillin resistance (clinical isolates provided by University of California, San Diego, Medical Center, #12144G) (Benson, Microbial Applications, 6th Ed. Wm. C. Brown Publishers, Dubuque, Iowa (1994)). The results of this assay establish that bacteria that are resistant to methicillin are not resistant to the antibacterial effect of the compound.

[0187] e) multi-drug resistance assay using E. faecalis by placing a compound on a paper disk and determining the ability of the compound to inhibit the growth of E. faecalis exhibiting vancomycin resistance (clinical isolates provided by University of California, San Diego, Medical Center, #8673G) (Benson, Microbial Applications, 6th Ed. Wm. C. Brown Publishers, Dubuque, Iowa (1994)). The results of this assay establish that bacteria that are resistant to vancomycin are not resistant to the antibacterial effect of the compound.

[0188] f) inhibition of the growth of cancer cells by contacting a compound with the National Cancer Institute's (NCI) cell line screen (approximately sixty cell lines) against up to fifty-one cancer cell types in vitro (Boyd et al., Drug Development Research, 34:91-109 (1995)). These results provide an activity profile for the compound. The activity profile of a compound can be compared to the activity profiles of other samples in the NCI database of activity profiles. Similar activity profiles of different samples, including known samples with known modes of action, strongly suggest that the samples have similar modes of action. A novel activity profile strongly suggests that the compound has a novel mechanism of action.

[0189] g) immunomodulatory effects of a compound on the immune system in vitro or in vivo. For example, the modulation, increase or decrease of the activity of the cellular immune response, humoral immune response, or both, can be measured using methods known in the art. For example, T-cell response and B-cell response can be monitored by the type and amount of cytokines produced by a population of cells (such as T_H1 and T_H2 profiles), cytotoxic T-cell response can be determined using chromium release assays, B-cell response can be measured using immunoassays to detect the presence of specific antibodies, histamine release can be used to detect the presence and activity of mast cells, and the presence or absence of cell surface markers, such as CD4, CD3 or CD34, can be used to detect the presence or amount of cell populations in a sample. A wide variety of such tests are known in the art (see, for example, Harlow, Antibodies: a Laboratory Manual, Cold Spring Harbor Press (1988); Roitt et al., Immunology, Third Edition, Mosby, St. Louis (1993); Zing et al., Biochem J. 319 (Pt. 1):159-165 (1996); Hess et al., J. Immunol. 141:3263-3269 (1998)). The results of these types of assays establish the immunomodulatory effects of a compound.

[0190] h) anti-inflammatory effects of a compound can be measured in vitro or in vivo. For example, animal models for such anti-inflammatory effects of compounds, such as the rabbit knee or mouse ear, can be used. In addition, the specificity of T-cell responses and B-cell responses in an inflammatory mode can be monitored by monitoring the type and response of such cells and their cytokine profiles (Bradley et al., J. Invest. Dermatology 78:206-209 (1982)). The results of these types of assays establish the anti-inflammatory activity of a compound.

[0191] i) radiation protective effects of a compound can be measured in vitro or in vivo. For example, cells in culture, a whole animal, including humans, or a portion of an animal can be exposed to a variety of doses of a compound before or after being exposed to a variety of doses and types of radiation, including ionizing radiation, preferably a dose and type of radiation used to treat cancers or tumors. The ability of the compound to protect the cells or animal can be measured using methods known in the art. For example, the ability of the cells or animal to survive longer, or the ability of the cells or animal to be in a healthier state, when treated with a compound indicates that the compound has a radiation protective effect (Grant et al., Blood 83:663-667 (1994)). The results of these types of assays establish the radiation protective activity of a compound.

[0192] j) PKC modulating effects of a compound can be measured by contacting a compound with a sample including at least a portion of a PKC, including a cell, and determining the modulation of PKC activity or the binding of the compound to PKC. Such methods are known in the art (see, DeVries et al., 1988).

[0193] k) cytotoxic activity of a compound can be determined by a variety of methods, including inhibition of brine shrimp by contacting twenty-four hour old brine shrimp nauplii for twenty-four hours with a compound and observing the inhibition of the activity or viability of the brine shrimp. The results of this assay establish the cytotoxicity of the compound towards whole organisms.
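By way of illustration only, the comparison of activity profiles described in assay f) above can be sketched as follows. This is a minimal sketch that assumes each profile is a mapping from cell line name to a numeric response value, and it uses a Pearson correlation over the shared cell lines as the similarity measure. The choice of measure, the function names and the example values are assumptions for illustration and are not specified in the present disclosure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def profile_similarity(profile_a, profile_b):
    """Correlate two activity profiles over the cell lines they share."""
    shared = sorted(set(profile_a) & set(profile_b))
    return pearson([profile_a[c] for c in shared],
                   [profile_b[c] for c in shared])


if __name__ == "__main__":
    # Hypothetical cell lines and response values, for illustration only.
    test_compound = {"line_1": 1.0, "line_2": 2.0, "line_3": 3.0}
    reference = {"line_1": 2.0, "line_2": 4.0, "line_3": 6.0, "line_4": 0.5}
    print(profile_similarity(test_compound, reference))
```

A coefficient near one indicates that the two profiles rise and fall together across the shared cell lines; a coefficient near zero indicates dissimilar profiles.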

[0194] Compounds identified as having a bioactivity have presumptive therapeutic activity. Such therapeutic activity and related pharmacological parameters can be confirmed using the methods discussed herein.

[0195] Pharmacology and Toxicity of Bioactive Compounds and Bioactivities

[0196] The structure of a bioactive compound or bioactivity can be determined or confirmed by methods known in the art, such as mass spectroscopy. For bioactive compounds and bioactivities stored for extended periods of time under a variety of conditions, the structure, activity and potency thereof can be confirmed.

[0197] Identified bioactive compounds and bioactivities can be evaluated for a particular activity using art-recognized methods and those disclosed herein. For example, if an identified bioactive compound or bioactivity is found to have anticancer cell activity in vitro, then the bioactive compound or bioactivity would have presumptive pharmacological properties as a chemotherapeutic to treat cancer. Such nexuses are known in the art for several disease states, and more are expected to be discovered over time. Based on such nexuses, appropriate confirmatory in vitro and in vivo models of pharmacological activity, and toxicology, can be selected and performed. The methods described herein can also be used to assess pharmacological selectivity and specificity, and toxicity.

[0198] Identified bioactive compounds and bioactivities can be evaluated for toxicological effects using known methods (see, Lu, Basic Toxicology, Fundamentals, Target Organs, and Risk Assessment, Hemisphere Publishing Corp., Washington (1985); U.S. Pat. Nos. 5,196,313 to Culbreth (issued Mar. 23, 1993) and 5,567,952 to Benet (issued Oct. 22, 1996)). For example, toxicology of a bioactive compound or bioactivity can be established by determining in vitro toxicity towards a cell line, such as a mammalian, for example human, cell line. Bioactive compounds and bioactivities can be treated with, for example, tissue extracts, such as preparations of liver, such as microsomal preparations, to determine increased or decreased toxicological properties of the bioactive compound or bioactivity after being metabolized by a whole organism. The results of these types of studies are predictive of toxicological properties of chemicals in animals, such as mammals, including humans.

[0199] Alternatively, or in addition to these in vitro studies, the toxicological properties of a bioactive compound or bioactivity in an animal model, such as mice, rats, rabbits, dogs or monkeys, can be determined using established methods (see, Lu, supra (1985); and Creasey, Drug Disposition in Humans, The Basis of Clinical Pharmacology, Oxford University Press, Oxford (1979)). Depending on the toxicity, target organ, tissue, locus and presumptive mechanism of the bioactive compound or bioactivity, the skilled artisan would not be burdened to determine appropriate doses, LD₅₀ values, routes of administration and regimes that would be appropriate to determine the toxicological properties of the bioactive compound or bioactivity. In addition to animal models, human clinical trials can be performed following established procedures, such as those set forth by the United States Food and Drug Administration (USFDA) or equivalents of other governments. These toxicity studies provide the basis for determining the efficacy of a bioactive compound or bioactivity in vivo.

[0200] Efficacy of Bioactive Compounds and Bioactivities

[0201] Efficacy of a bioactive compound or bioactivity can be established using several art recognized methods, such as in vitro methods, animal models or human clinical trials (see, Creasey, supra (1979)). Recognized in vitro models exist for several diseases or conditions. For example, the ability of a compound or composition to extend the life-span of HIV-infected cells in vitro is recognized as an acceptable model to identify chemicals expected to be efficacious to treat HIV infection or AIDS (see, Daluge et al., Antimicro. Agents Chemother. 41:1082-1093 (1995)). Furthermore, the ability of cyclosporin A (CsA) to prevent proliferation of T-cells in vitro has been established as an acceptable model to identify chemicals expected to be efficacious as immunosuppressants (see, Suthanthiran et al., supra (1996)). For nearly every class of therapeutic, disease or condition, an acceptable in vitro or animal model is available. In addition, these in vitro methods can use tissue extracts, such as preparations of liver, such as microsomal preparations, to provide a reliable indication of the effects of metabolism on a bioactive compound or bioactivity. Similarly, acceptable animal models can be used to establish efficacy of bioactive compounds and bioactivities to treat various diseases or conditions. For example, the rabbit knee is an accepted model for testing agents for efficacy in treating arthritis (see, Shaw and Lacy, J. Bone Joint Surg. (Br.) 55:197-205 (1973)). Hydrocortisone, which is approved for use in humans to treat arthritis, is efficacious in this model which confirms the validity of this model (see, McDonough, Phys. Ther. 62:835-839 (1982)). When choosing an appropriate model to determine efficacy of bioactive compounds and bioactivities, the skilled artisan can be guided by the state of the art to choose an appropriate model, doses and route of administration, regime and endpoint and as such would not be unduly burdened.

[0202] In addition to animal models, human clinical trials can be used to determine the efficacy of bioactive compounds and bioactivities. The USFDA, or equivalent governmental agencies, have established procedures for such studies.

[0203] Selectivity of Bioactive Compounds and Bioactivities

[0204] The in vitro and in vivo methods described above also establish the selectivity of a candidate modulator. It is recognized that chemicals can modulate a wide variety of biological processes or be selective. Panels of cells as they are known in the art can be used to determine the specificity of a bioactive compound or bioactivity (WO 98/13353 to Whitney et al., published Apr. 2, 1998). Selectivity is evident, for example, in the field of chemotherapy, where the selectivity of a chemical to be toxic towards cancerous cells, but not towards non-cancerous cells, is obviously desirable. Selective modulators are preferable because they have fewer side effects in the clinical setting. The selectivity of a bioactive compound or bioactivity can be established in vitro by testing the toxicity and effect of the bioactive compound or bioactivity on a plurality of cell lines that exhibit a variety of cellular pathways and sensitivities. The data obtained from these in vitro toxicity studies can be extended to animal model studies, including human clinical trials, to determine toxicity, efficacy and selectivity of a bioactive compound or bioactivity.

[0205] The selectivity, specificity and toxicology, as well as the general pharmacology, of a bioactive compound or bioactivity can be often improved by generating additional test chemicals based on the structure/property relationship of a bioactive compound or bioactivity originally identified as having activity. Bioactive compounds and bioactivities can be modified to improve various properties, such as affinity, life-time in blood, toxicology, specificity and membrane permeability. Such refined bioactive compounds and bioactivities can be subjected to additional assays as they are known in the art or described herein. Methods for generating and analyzing such compounds or compositions are known in the art, such as U.S. Pat. No. 5,574,656 to Agrafiotis et al.

[0206] Pharmaceutical Compositions

[0207] The present invention also encompasses a bioactive compound or bioactivity in a pharmaceutical composition comprising a pharmaceutically acceptable carrier prepared for storage and preferably subsequent administration, which have a pharmaceutically effective amount of the bioactive compound or bioactivity in a pharmaceutically acceptable carrier or diluent. Acceptable carriers or diluents for therapeutic use are well known in the pharmaceutical art, and are described, for example, in Remington's Pharmaceutical Sciences, Mack Publishing Co., (A. R. Gennaro edit. (1985)). Preservatives, stabilizers, dyes and even flavoring agents can be provided in the pharmaceutical composition. For example, sodium benzoate, sorbic acid and esters of p-hydroxybenzoic acid can be added as preservatives. In addition, antioxidants and suspending agents can be used.

[0208] The bioactive compounds and bioactivities of the present invention can be formulated and used as tablets, capsules or elixirs for oral administration; suppositories for rectal administration; sterile solutions or suspensions for injectable administration; and the like. Injectables can be prepared in conventional forms either as liquid solutions or suspensions, solid forms suitable for solution or suspension in liquid prior to injection, or as emulsions. Suitable excipients are, for example, water, saline, dextrose, mannitol, lactose, lecithin, albumin, sodium glutamate, cysteine hydrochloride and the like. In addition, if desired, the injectable pharmaceutical compositions can contain minor amounts of nontoxic auxiliary substances, such as wetting agents, pH buffering agents and the like. If desired, absorption enhancing preparations, such as liposomes, can be used.

[0209] The pharmaceutically effective amount of a bioactive compound or bioactivity required as a dose will depend on the route of administration, the type of animal or patient being treated, and the physical characteristics of the specific animal under consideration. The dose can be tailored to achieve a desired effect, but will depend on such factors as weight, diet, concurrent medication and other factors which those skilled in the medical arts will recognize. In practicing the methods of the present invention, the pharmaceutical compositions can be used alone or in combination with one another, or in combination with other therapeutic or diagnostic agents. These products can be utilized in vivo, preferably in a mammalian patient, preferably in a human, or in vitro. In employing them in vivo, the pharmaceutical compositions can be administered to the patient in a variety of ways, including parenterally, intravenously, subcutaneously, intramuscularly, colonically, rectally, nasally or intraperitoneally, employing a variety of dosage forms. Such methods can also be used in testing the activity of bioactive compounds or bioactivities in vivo.

[0210] As will be readily apparent to one skilled in the art, the useful in vivo dosage to be administered and the particular mode of administration will vary depending upon the age, weight and type of patient being treated, the particular pharmaceutical composition employed, and the specific use for which the pharmaceutical composition is employed. The determination of effective dosage levels, that is the dose levels necessary to achieve the desired result, can be accomplished by one skilled in the art using routine methods as discussed above. Typically, human clinical applications of products are commenced at lower dosage levels, with dosage level being increased until the desired effect is achieved. Alternatively, acceptable in vitro studies can be used to establish useful doses and routes of administration of the bioactive compounds and bioactivities.

[0211] In non-human animal studies, applications of the pharmaceutical compositions are commenced at higher dose levels, with the dosage being decreased until the desired effect is no longer achieved or adverse side effects are reduced or disappear. The dosage for the bioactive compounds and bioactivities of the present invention can range broadly depending upon the desired effects, the therapeutic indication, route of administration and purity and activity of the bioactive compound or bioactivity. Typically, dosages can be between about 1 ng/kg and about 10 mg/kg, preferably between about 10 ng/kg and about 1 mg/kg, more preferably between about 100 ng/kg and about 100 micrograms/kg, and most preferably between about 1 microgram/kg and about 10 micrograms/kg.

[0212] The exact formulation, route of administration and dosage can be chosen by the individual physician in view of the patient's condition (see, Fingle et al., in The Pharmacological Basis of Therapeutics (1975)). It should be noted that the attending physician would know how to and when to terminate, interrupt or adjust administration due to toxicity, organ dysfunction or other adverse effects. Conversely, the attending physician would also know to adjust treatment to higher levels if the clinical response were not adequate. The magnitude of an administered dose in the management of the disorder of interest will vary with the severity of the condition to be treated and with the route of administration. The severity of the condition may, for example, be evaluated, in part, by standard prognostic evaluation methods. Further, the dose, and perhaps the dose frequency, will also vary according to the age, body weight and response of the individual patient, including those for veterinary applications.

[0213] Depending on the specific conditions being treated, such pharmaceutical compositions can be formulated and administered systemically or locally. Techniques for formulation and administration can be found in Remington's Pharmaceutical Sciences, 18th Ed., Mack Publishing Co., Easton, Pa. (1990). Suitable routes of administration can include oral, rectal, transdermal, otic, ocular, vaginal, transmucosal or intestinal administration; parenteral delivery, including intramuscular, subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, intraperitoneal, intranasal, or intraocular injections.

[0214] For injection, the pharmaceutical compositions of the present invention can be formulated in aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution or physiological saline buffer. For transmucosal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art. Use of pharmaceutically acceptable carriers to formulate the pharmaceutical compositions herein disclosed for the practice of the invention into dosages suitable for systemic administration is within the scope of the invention. With proper choice of carrier and suitable manufacturing practice, the compositions of the present invention, in particular, those formulated as solutions, can be administered parenterally, such as by intravenous injection. The pharmaceutical compositions can be formulated readily using pharmaceutically acceptable carriers well known in the art into dosages suitable for oral administration. Such carriers enable the bioactive compounds and bioactivities of the invention to be formulated as tablets, pills, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be treated.

[0215] Agents intended to be administered intracellularly may be administered using techniques well known to those of ordinary skill in the art. For example, such agents may be encapsulated into liposomes, then administered as described above. Substantially all molecules present in an aqueous solution at the time of liposome formation are incorporated into or within the liposomes thus formed. The liposomal contents are both protected from the external micro-environment and, because liposomes fuse with cell membranes, are efficiently delivered into the cell cytoplasm. Additionally, due to their hydrophobicity, small organic molecules can be directly administered intracellularly.

[0216] Pharmaceutical compositions suitable for use in the present invention include compositions wherein the active ingredients are contained in an effective amount to achieve their intended purpose. Determination of the effective amount of a pharmaceutical composition is well within the capability of those skilled in the art, especially in light of the detailed disclosure provided herein. In addition to the active ingredients, these pharmaceutical compositions can contain suitable pharmaceutically acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active chemicals into preparations which can be used pharmaceutically. The preparations formulated for oral administration may be in the form of tablets, dragees, capsules or solutions. The pharmaceutical compositions of the present invention can be manufactured in a manner that is itself known, for example by means of conventional mixing, dissolving, granulating, dragee-making, emulsifying, encapsulating, entrapping or lyophilizing processes. Pharmaceutical formulations for parenteral administration include aqueous solutions of active chemicals in water-soluble form.

[0217] Additionally, suspensions of the active chemicals may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides or liposomes. Aqueous injection suspensions may contain substances that increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol or dextran. Optionally, the suspension can also contain suitable stabilizers or agents that increase the solubility of the chemicals to allow for the preparation of highly concentrated solutions.

[0218] Pharmaceutical compositions for oral use can be obtained by combining the active chemicals with solid excipient, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethylcellulose, sodium carboxymethylcellulose and/or polyvinylpyrrolidone. If desired, disintegrating agents can be added, such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid or a salt thereof such as sodium alginate. Dragee cores can be provided with suitable coatings. Dyes or pigments can be added to the tablets or dragee coatings for identification or to characterize different combinations of active doses.

[0219] The bioactive compounds and bioactivities of the present invention, and pharmaceutical compositions that include such bioactive compounds and bioactivities, are useful for treating a variety of ailments in a patient, including a human. As set forth in the Examples, the bioactive compounds and bioactivities of the present invention have antibacterial, antimicrobial, antiviral, anticancer cell, antitumor and cytotoxic activity. A patient in need of such treatment can be provided a bioactive compound or bioactivity of the present invention, preferably in a pharmacological composition in an effective amount to reduce the number or growth rate of bacteria, microbes, cancer cells or tumor cells in said patient, or to reduce the infectivity of viruses in said patient. The amount, dosage, route of administration, regime and endpoint can all be determined using the procedures described herein.

[0220] VI. Bacterial Symbionts of Bugula

[0221] The present invention includes a preparation of at least one bacterial symbiont of B. neritina or B. pacifica, wherein said bacterial symbiont comprises at least one polypeptide that has at least one PKS activity. The preparation of at least one bacterial symbiont can be substantially free of its host Bugula and can be provided as an isolated preparation or isolated culture (only one type of bacteria) or a mixed preparation or mixed culture (more than one type of bacteria). As set forth in the Examples, bacterial symbionts of Bugula neritina and Bugula pacifica are involved in the production of polyketides including bryopyran rings and bryostatins.

[0222] VII. Polyketides, Bryopyrans and Bryostatins from Bugula pacifica and Symbionts

[0223] The present invention also includes at least one bioactive compound present in a Bugula pacifica. The bioactive compound is preferably a polyketide, a bryopyran ring or a bryostatin. The composition preferably has at least one activity of at least one bryostatin, which can be confirmed using methods of the present invention. The composition can be made by isolating the composition from B. pacifica using methods of the present invention, such as by extraction from B. pacifica using appropriate solvents, such as ethanol. The extracts can be separated to obtain pure or substantially pure compounds using methods such as HPLC. The bioactivity of these compounds, either alone or in combination, can be confirmed using methods of the present invention. Such compounds can be provided in a pharmaceutically acceptable carrier and can be provided as a pharmaceutical composition.

EXAMPLES

Example 1 Extraction of Nucleic Acid Molecules From Samples of B. neritina That Include Symbionts

[0224] Unless otherwise noted, molecular biology procedures used are standard techniques used in the field (Sambrook et al., 1989).

[0225] DNA Extraction

[0226] Samples of either adult or larval B. neritina were obtained from the waters of southern California. Free-swimming larvae were collected in the lab from adult B. neritina colonies. Larval DNA was extracted as previously described (Haygood and Davidson, 1997). Larvae were concentrated in 1.5 milliliter microcentrifuge tubes (approximately 25 mg of larvae per tube) by gentle centrifugation and then rinsed four times in filtered seawater (0.2 micrometer pore size filter) to minimize the contaminating seawater bacteria. Excess water was removed and the pellets were frozen at −80° C. for later use. DNA was extracted from the B. neritina larvae using a QIAamp Tissue Kit as directed by the manufacturer (Qiagen Inc., Valencia, Calif.).

[0227] Adult B. neritina total DNA was extracted using a modification of the method of Shure et al. (Shure et al., 1983). Briefly, two grams of fresh adult B. neritina was frozen on dry ice and pulverized in a mortar. The powdered tissue was taken up in 6 ml of extraction buffer containing 8 M urea, 0.35 M NaCl, 0.05 M Tris-HCl (pH 7.5), 0.02 M EDTA and 2% sarcosyl. An equal volume of phenol/chloroform/isoamyl alcohol (25:24:1), pH 8.0 was added and mixed thoroughly. After 15 minutes at room temperature, the solution was again thoroughly mixed and centrifuged at 8,000 rpm for 10 minutes at 4° C. to separate the phases. The supernatant was transferred to a clean tube and a second phenol/chloroform/isoamyl alcohol extraction was performed. The supernatant was transferred to a clean tube and the salt concentration was increased by the addition of 1/10 volume 3 M sodium acetate, pH 5.2 and a volume of isopropanol equal to the new total volume. The solution was mixed well and centrifuged at 8,000 rpm for 10 min to collect the pellet. The pellet was washed with 70% ethanol and air dried for 10 min. The DNA was dissolved in approximately 1/5 the original lysis volume of 10 mM Tris-HCl (pH 7.5), 10 mM EDTA. The DNA was subsequently run through a Sephadex G-200 spin column (Maloy et al., 1996) to remove any PCR inhibitors.
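The salt and alcohol additions in the precipitation step scale with the volume of aqueous phase recovered. The following Python sketch works through that arithmetic. The recovered supernatant volume is a hypothetical input, since the text gives only the ratios, and the function name is illustrative rather than part of the described method.

```python
def precipitation_volumes(supernatant_ml, acetate_stock_molar=3.0):
    """Volumes for the sodium acetate / isopropanol precipitation.

    supernatant_ml is a hypothetical value; the procedure states only ratios.
    """
    acetate_ml = supernatant_ml / 10.0          # 1/10 volume of 3 M sodium acetate
    new_total_ml = supernatant_ml + acetate_ml  # volume after the salt is added
    isopropanol_ml = new_total_ml               # isopropanol equal to the new total volume
    # Sodium acetate concentration after mixing, before isopropanol is added.
    acetate_molar = acetate_stock_molar * acetate_ml / new_total_ml
    return {
        "acetate_ml": acetate_ml,
        "isopropanol_ml": isopropanol_ml,
        "final_ml": new_total_ml + isopropanol_ml,
        "acetate_molar_before_alcohol": acetate_molar,
    }


if __name__ == "__main__":
    # Example with an assumed 5 ml of recovered supernatant.
    for name, value in precipitation_volumes(5.0).items():
        print(f"{name}: {value:.3f}")
```

With the 6 ml lysis volume stated above, dissolving the pellet in approximately 1/5 of that volume corresponds to roughly 1.2 ml of buffer.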

Example 2 Identification of PKS Genes in Samples of B. neritina That Include Symbionts

[0228] Primer Design

[0229] Degenerate PCR primers (PKSR and BLCASPKS) were designed based on conserved regions of the beta-ketoacyl synthase (KS) domain of PKS-I genes.

TABLE I
PCR primers designed for this study.

Primer designation and orientation | Sequence | SEQ ID NO.
PKSR (forward) | ACR TGI GCR TTI GTI CC | SEQ ID NO: 1
BLCASPKS (reverse) | ICA YGG IAC IGG IAC | SEQ ID NO: 2
SWA38R (forward) | ACG GAC AAG CGT CAT TAC | SEQ ID NO: 3
SWA38L (reverse) | GCC AAG GCT TTA ATT CCG | SEQ ID NO: 4
SWA38F3 (forward) | GTT GTC TTT GCA GCA TCG CAT GTT ACC AC | SEQ ID NO: 5
SWA38R3 (reverse) | CAC GCC CGC TAT CCC AGC ACC TAC C | SEQ ID NO: 6
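The degenerate primers in Table I mix standard ambiguity codes (R, Y) with positions written as I. The Python sketch below counts how many distinct oligonucleotides each primer represents. It assumes that I denotes inosine, which the text does not state explicitly, and treats each inosine as a single synthesized residue rather than a mixture. The function name is illustrative.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_summary(sequence):
    """Return length, fold degeneracy and inosine count for a primer."""
    seq = sequence.replace(" ", "").upper()
    degeneracy = 1
    inosines = 0
    for base in seq:
        if base == "I":
            inosines += 1  # assumed inosine: does not multiply the oligo pool
        else:
            degeneracy *= len(IUPAC[base])
    return {"length": len(seq), "degeneracy": degeneracy, "inosines": inosines}


if __name__ == "__main__":
    for name, seq in [("PKSR", "ACR TGI GCR TTI GTI CC"),
                      ("BLCASPKS", "ICA YGG IAC IGG IAC")]:
        print(name, primer_summary(seq))
```

Under that assumption, PKSR is a 17-mer with 4-fold degeneracy and three inosines, and BLCASPKS is a 15-mer with 2-fold degeneracy and four inosines.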

[0230] The PCR conditions for the initial amplification of the KS genes from B. neritina DNA were as follows. A total reaction volume of 50 microliter contained approximately 100 ng of B. neritina DNA (either adult or larval), 1 micromolar each primer (PKSR and BLCASPKS), and Taq polymerase and buffer (Boehringer Mannheim Corp., Indianapolis, Ind.). A PCR protocol was optimized for the degenerate KS primers PKSR and BLCASPKS. The cycle conditions started with a “touch down” sequence, which lowered the annealing temperature from 60 to 40° C. at a rate of 2° C. per cycle (11 cycles), and were then maintained at 40° C. for a total of 51 cycles. Cycle steps were as follows: denaturation (94° C.; 1 min), annealing (60° C. to 40° C.; 2 min), and extension (72° C.; 1 min). PCR conditions for the KSa specific primers (SWA38R and SWA38L) were as follows: denaturation (94° C.; 1 min), annealing (60° C.; 1 min), and extension (72° C.; 2 min).
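The touch down portion of the protocol can be written out as a per-cycle list of annealing temperatures. The Python sketch below is one reading of the description: it assumes that the 51 cycles include the 11 touch down cycles (60° C. down to 40° C. in 2° C. steps), leaving 40 cycles at 40° C. The function name and parameters are illustrative.

```python
def touchdown_annealing_schedule(start_c=60, end_c=40, step_c=2, total_cycles=51):
    """Annealing temperature (deg C) for each cycle of a touch down PCR.

    Assumes total_cycles counts the touch down cycles as well as the
    cycles held at the final annealing temperature.
    """
    # Touch down phase: 60, 58, ..., 40 (inclusive of both ends).
    temps = list(range(start_c, end_c - 1, -step_c))
    # Remaining cycles are held at the final annealing temperature.
    temps += [end_c] * (total_cycles - len(temps))
    return temps


if __name__ == "__main__":
    schedule = touchdown_annealing_schedule()
    for cycle, temp in enumerate(schedule, start=1):
        print(f"cycle {cycle:2d}: 94 C 1 min / {temp} C 2 min / 72 C 1 min")
```

With the step times given above (1 min, 2 min, 1 min), each cycle holds for 4 minutes, so 51 cycles amount to 204 minutes excluding ramp times.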

[0231] Inverse PCR

[0232] To obtain flanking DNA sequence, the most prevalent clone, KSa, was extended using inverse PCR (Ochman et al., 1990). Adult B. neritina DNA was digested with the restriction enzyme Sau3AI and then religated using T4 DNA ligase (Stratagene, La Jolla, Calif.). KSa specific primers (SWA38F3 and SWA38R3) flanking the Sau3AI restriction site were used to PCR amplify from the ligation reaction using TaqPlusLong polymerase (Stratagene, La Jolla, Calif.) as recommended by the manufacturer, using the PCR conditions listed above for primers SWA38R and SWA38L.

[0233] DNA Cloning

[0234] PCR reactions were electrophoresed on a 0.8% agarose gel and visualized with ethidium bromide. PCR products of approximately 300 bp were cloned using a TOPO TA Cloning kit into Invitrogen pCR® 2.1-TOPO vector as described by the manufacturer (Invitrogen Corp., Carlsbad, Calif.). Recombinant clones containing insert DNA were then sequenced using standard protocols. Since PKS-I enzymes are modular, clones from the degenerate PCR primers represent a pool of fragments from different KS domains.

[0235] DNA Sequencing

[0236] Plasmid DNA for sequencing was prepared using the Qiagen QIAprep Spin Miniprep Kit (Qiagen, Inc., Valencia, Calif.). All sequencing was performed with an ABI automated sequencer (model 373A) by using a PRISM Ready Reaction DyeDeoxy terminator cycle sequencing kit as recommended by the manufacturer (Perkin-Elmer). Cloned genes were sequenced using primers directed against the cloning vector, pCR® 2.1-TOPO (Invitrogen, Carlsbad, Calif.).

Example 3 Sequences of PKS from B. neritina That Include Symbionts

[0237] Degenerate primers (PKSR and BLCASPKS) were designed to conserved regions of the KS domains of the bacterial PKS-I. These were used in a step down PCR to amplify a 300 bp fragment from B. neritina DNA. Since PKS-I enzymes are modular, these PCR products are a pool of fragments from different PKS modules, and were cloned before sequencing. Two clone libraries were prepared, one from adult B. neritina DNA and one from larval DNA. Twenty-seven clones have been sequenced (Table II).

TABLE II
cDNA Clones obtained from B. neritina

Clone designation | Number of isolates | Source (larval or adult)
KSa | 13 | Larval
KSb | 3 | Larval and Adult
KSc | 2 | Adult
KSd | 2 | Adult
KSe | 3 | Larval
KSf | 1 | Adult
KSg | 1 | Adult
KSh | 1 | Larval
KSi | 1 | Adult

[0238] Nine unique clones have been identified (KSa-KSi) (Table III). One clone (KSb) appeared in both libraries. Cloned DNA sequences were identified by using the BLAST (basic local alignment search tool) server of the National Center for Biotechnology Information accessed over the Internet (Altschul et al., 1997). All of these sequences have signature regions for KS and show highest similarity in BLAST searches to bacterial PKS-I, showing that they are in fact of PKS-I origin.

TABLE III
Sequences of (KSa-KSi) clones obtained from B. neritina and predicted amino acid products

SEQ ID NO. for clones | SEQ ID NO. for amino acids
SEQ ID NO: 9 | SEQ ID NO: 10
SEQ ID NO: 13 | SEQ ID NO: 14
SEQ ID NO: 15 | SEQ ID NO: 16
SEQ ID NO: 17 | SEQ ID NO: 18
SEQ ID NO: 19 | SEQ ID NO: 20
SEQ ID NO: 21 | SEQ ID NO: 22
SEQ ID NO: 23 | SEQ ID NO: 24
SEQ ID NO: 25 | SEQ ID NO: 26
SEQ ID NO: 27 | SEQ ID NO: 28

[0239] Inverse PCR from the most prevalent clone, KSa, yielded additional sequence for a total of 737 bp of contiguous DNA (SEQ ID NO:13). (The predicted amino acid product is presented in SEQ ID NO:14.) This sequence is similar to the 3′ end of the KS domain and an ‘intermodular’ region upstream of the acyl transferase (AT) domain when aligned with the pikAI KS domain from Streptomyces venezuelae (Xue et al., 1998) and DEBSI from Saccharopolyspora erythraea (Donadio et al., 1991). This sequence represents a single ORF as predicted for a PKS-I. In total, 3.2 kb of DNA has been cloned from adult and larval B. neritina DNA. All of the clones exhibit significant similarity to other PKS-I genes as illustrated by BLAST searches (Altschul et al., 1997). The GC content of the B. neritina clones is very low, ranging from about 32% to about 53%. This is lower than any other known bacterial PKS-I. This underscores the novelty of these genes. Due to the low GC content, it is unlikely that these PKS genes were obtained by lateral transfer from actinomycetes as suggested for other PKS genes, such as those from the myxobacterium Sorangium cellulosum (Schupp et al., 1995).

[0240] As SSU rRNA sequencing has shown, E. sertula is a γ-proteobacterium closely related to Pseudomonas fluorescens and P. syringae (Haygood and Davidson, 1997). However, the GC content of the Pseudomonas PKS genes is much higher, 63% for Pseudomonas fluorescens (Nowak-Thompson et al., 1997) and 67% for P. syringae (Rangaswamy et al., 1998). Thus, the PKS genes isolated here from B. neritina DNA (and presumably E. sertula) are unique, even among other closely related γ-proteobacteria. A genetic feature shared by the myxobacteria and actinomycetes is a high GC content (67 to 71 and 69 to 73 mol %, respectively) (Seow et al., 1997; Schupp et al., 1995). Again, this reinforces the novel nature of the B. neritina/E. sertula derived PKS genes.
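The GC comparisons above rest on a simple base count. The Python sketch below shows that calculation. Because the clone sequences themselves are not reproduced in this section, the example input is the SWA38R primer sequence from Table I, used purely as a convenient test string; its GC content says nothing about the GC content of the clones. The function name is illustrative.

```python
def gc_percent(sequence):
    """Percent G+C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = "".join(sequence.split()).upper()
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    # SWA38R primer from Table I, used only as example input.
    print(gc_percent("ACG GAC AAG CGT CAT TAC"))
```

Ambiguous positions are skipped rather than guessed, so the result reflects only bases that were actually called.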

[0241] When the degenerate oligonucleotides were used in step down PCR with B. neritina DNA as a template, PCR products of approximately 300 bp were obtained using adult DNA from several locations and seasons. Single PCR products of approximately 300 bp were evident in amplifications from either adult or larval B. neritina DNA (containing both host bryozoan DNA and microbial symbiont DNA, presumably E. sertula). The consistent amplification of this product from both adult and larval B. neritina DNA suggests that it is not due to a sporadic contaminant. These PCR products are of the expected size based upon the location of the primers within the KS domains of other PKS-I genes.

[0242] Link Between the KSa Gene, B. neritina, and Bryostatins

[0243] Specific primers (SWA38R and SWA38L) were designed against variable regions from an abundant clone, KSa, isolated from the B. neritina larval library. These primers were designed to PCR amplify only this fragment, unlike the degenerate primers used to obtain the libraries, which amplify from any KS domain. Ten samples of DNA from adult colonies from a wide range of locations along the California coast from San Diego to Humboldt Bay, one adult sample from North Carolina, and two larval samples were screened for a KSa specific sequence by PCR amplification. All produced a strong band with the KSa specific primers (FIG. 6, lanes A-M). Two samples of other bryozoans collected together with B. neritina were negative (FIG. 6, lanes N, O). This result demonstrates that KSa is universal in B. neritina. These characteristics are those expected for a fragment derived from bryopyran synthase. KSb, which was independently isolated from adult and larval amplifications, is another strong candidate as a sequence originating in bryopyran synthase. Based on the structure of other PKS genes, the present inventors expect twelve KS domains in bryopyran synthase.

[0244] B. neritina larvae were treated with gentamicin at about 100 micrograms per milliliter for about seven days post-settlement, grown out for 3 months to re-establish their commensal bacterial flora, and then assayed for E. sertula levels and KSa levels by specific PCR and for bryostatin activity by the phorbol dibutyrate (PdBu) displacement assay (FIG. 7). The standard method for detecting bryostatins is the PdBu displacement assay (DeVries et al., 1988). Rat brain liposomes are incubated with tritiated PdBu, which binds to PKC in the liposomes. The liposomes are collected by filtration and counted in a scintillation counter. If an extract is added that contains a compound that can compete with phorbol for binding, less tritiated PdBu is bound. The assay has a subnanomolar detection limit for pure bryostatin 1 and is suitable for milligram quantities of sample. In B. neritina, the only compounds with PdBu activity are bryostatins, and PdBu activity is a good measure of total bryostatin activity.
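The readout of a displacement assay of this kind is commonly expressed as the percentage of bound radioligand lost when the extract is present. The Python sketch below illustrates that calculation under stated assumptions: the counts are hypothetical, the optional nonspecific-binding correction is a common practice that the text does not describe, and the function name is illustrative. It is not presented as the calculation used by DeVries et al.

```python
def percent_displacement(cpm_control, cpm_with_extract, cpm_nonspecific=0.0):
    """Percent of specifically bound tritiated PdBu displaced by an extract.

    cpm_control:      counts bound with no competitor (hypothetical input)
    cpm_with_extract: counts bound in the presence of the extract
    cpm_nonspecific:  counts not attributable to receptor binding
                      (assumed correction; defaults to none)
    """
    specific_control = cpm_control - cpm_nonspecific
    if specific_control <= 0:
        raise ValueError("control must exceed nonspecific binding")
    specific_sample = cpm_with_extract - cpm_nonspecific
    return 100.0 * (1.0 - specific_sample / specific_control)


if __name__ == "__main__":
    # Hypothetical counts per minute, for illustration only.
    print(percent_displacement(10000.0, 2500.0))
```

An extract with no competing compound leaves the bound counts unchanged and gives 0% displacement; greater displacement indicates more competing activity in the extract.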

[0245] Both E. sertula population and KSa levels were reduced dramatically and similarly and bryostatin activity was also reduced compared to untreated controls (FIG. 7). Denaturing gradient gel electrophoresis (DGGE) (Muyzer et al., 1993) experiments showed that E. sertula is the only bacterium whose population is reduced in treated colonies, and growth data showed the bryozoan growth rates were the same in treated and untreated colonies. The concomitant reduction of KSa suggests that this gene resides in E. sertula and is involved in bryostatin synthesis.

Example 4 Cloning of a PKS Gene Cluster from Endobugula sertula

[0246] Preparation of High Molecular Weight DNA from Bugula neritina for Cosmid Cloning

[0247] Tips from fresh Bugula neritina, which contain the symbiont E. sertula, were excised with scissors, blotted dry, and frozen at −80° C. in 8 gram aliquots. DNA was extracted by pulverizing small portions of one aliquot with a mortar and pestle chilled with dry ice, to keep the tissue frozen. After all of the tissue was pulverized it was added to 24 milliliters of lysis buffer (50 mM Tris, pH 8.0, 50 mM Na₂EDTA, 350 mM NaCl, 2% Na sarcosyl, 8 M urea) in a 50 milliliter screw cap tube. The sample was incubated for 5 minutes at room temperature, and then 10 milliliters of phenol:chloroform (1:1) was added. The tube was placed on a rotator and the aqueous and organic layers mixed for 40 minutes at 20 rpm. The tube was then centrifuged in a table top centrifuge for 5 minutes at maximum speed, and the upper layer gently transferred with a wide-bore pipet to a new tube. Ten milliliters of phenol:chloroform was added and the sample rotated for 25 minutes at 20 rpm. After centrifugation, the upper layer was transferred to a new tube, 10 milliliters of phenol:chloroform added, and the tube rotated for 20 minutes at 20 rpm. After centrifugation, the upper layer was removed, divided in half, and each half ethanol precipitated by adding one-tenth volume of 3 molar Na acetate, and 2 volumes of ethanol. The tubes were then centrifuged in a Sorvall RC5B centrifuge for 10 minutes at 10,000 rpm in an HB4 rotor, and the pellets were washed twice with 70% ethanol. One-half milliliter of TE buffer (10 mM Tris-HCl, pH 8.0, 1 mM Na₂EDTA) was added to each tube and the tubes were incubated with gentle shaking at 18° C. overnight. Pipetting was done with a cut-off pipetman tip. DNA was stored at 4° C.

[0248] Partial Digestion and Sucrose Gradient Fractionation of B. neritina DNA

[0249] B. neritina DNA was then partially digested with Sau3AI for cloning by the following protocol. Pilot experiments showed that between 0.04-0.08 units of enzyme per microliter were required in a reaction containing 0.1 microgram per microliter DNA, incubated for 1 hour at 37° C. Large scale digests were set up using 100 micrograms of DNA and 0.04, 0.06, and 0.08 units of enzyme per microliter, in 1 milliliter total volume, incubated at 37° C. for 1 hour. Forty microliters of 500 millimolar Na₂EDTA was added to each tube, and samples heated to 70° C. for 15 min. After cooling, aliquots were run on a 0.5% agarose gel to check the extent of digestion. All three treatments gave fragments within the desired size range of 30 to 50 kbp. Samples were phenol:chloroform extracted twice, ethanol precipitated, washed twice in 70% ethanol and resuspended in 200 microliters TE buffer (10 mM Tris-HCl, 1 mM Na₂EDTA). Samples were loaded on 10-40% sucrose gradients (made in 10 mM Tris-HCl, pH 8.0, 10 mM NaCl, 1 mM Na₂EDTA), and centrifuged in a SW41 rotor in a Beckman LC8 centrifuge for 22 hours at 22,000 rpm at 20° C. After completion, gradients were fractionated by removing 350 microliter aliquots from the top with a pipetman. Fractions were analyzed by running aliquots on an agarose gel. Appropriate fractions were precipitated by adding 150 microliters TE, 6 micrograms tRNA as carrier, 50 microliters 3 molar Na acetate, and 1 milliliter ethanol. Tubes were placed on ice 45 minutes, centrifuged 30 minutes at 10,000 rpm in an HB4 rotor, washed twice with 70% ethanol, and resuspended in 20 microliters water.
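The quantities in the large scale digests follow directly from the stated concentrations. The Python sketch below works through that arithmetic using only the figures given above (100 micrograms DNA, 1 milliliter reaction, three enzyme concentrations, 40 microliters of 500 millimolar EDTA). The function and key names are illustrative.

```python
def digest_setup(dna_ug=100.0, total_ul=1000.0,
                 enzyme_u_per_ul=(0.04, 0.06, 0.08),
                 edta_stock_mm=500.0, edta_ul=40.0):
    """Derived quantities for the partial Sau3AI digests described above."""
    dna_ug_per_ul = dna_ug / total_ul
    # Total enzyme units in each reaction; rounded to absorb float error.
    total_units = [round(conc * total_ul, 6) for conc in enzyme_u_per_ul]
    units_per_ug = [round(units / dna_ug, 6) for units in total_units]
    # EDTA concentration once the stop solution is mixed into the reaction.
    edta_final_mm = edta_stock_mm * edta_ul / (total_ul + edta_ul)
    return {
        "dna_ug_per_ul": dna_ug_per_ul,
        "total_units": total_units,
        "units_per_ug_dna": units_per_ug,
        "edta_final_mm": edta_final_mm,
    }


if __name__ == "__main__":
    for name, value in digest_setup().items():
        print(name, value)
```

The three reactions therefore contain 40, 60 and 80 units of enzyme (0.4, 0.6 and 0.8 units per microgram of DNA), at the same 0.1 microgram per microliter DNA concentration as the pilot experiments, and the EDTA addition brings the reaction to roughly 19 millimolar EDTA.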

[0250] Competitive PCR of B. neritina DNA

[0251] To determine whether the fractionated, partially digested DNA preparation from B. neritina had enough representation of E. sertula DNA for cloning, competitive PCR was performed using a clone containing a deletion in KSa as the competitor. An unfractionated preparation of DNA previously obtained by an alternate method was used as a control. Fraction number 18 from the 0.06 unit/microliter Sau3AI digest was used as representative of the more gently isolated B. neritina DNA preparation of this example. The results are shown in FIG. 8. Bacterial DNA in the fractionated DNA preparation amplified with equal intensity to the competitor at a 10-fold higher level of competitor, indicating that bacterial DNA was present at 10-fold higher levels in the DNA preparation of this example than in the previously isolated preparation. Based on estimations of bacterial and host genome size, there were about 6 bacterial genome equivalents per host genome equivalent. From this information it was determined that 14,000 cosmid clones were required to ensure a 95% probability of representation of a given gene.
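
Library size estimates of this kind are commonly made with the relation N = ln(1 − P) / ln(1 − f), where P is the desired probability of representation and f is the fraction of the total DNA carried by one insert. The sketch below assumes that this standard relation applies here; the text does not state which formula or genome sizes were used. The genome sizes in the example are hypothetical placeholders, and only the ratio of 6 bacterial genome equivalents per host genome equivalent and the 30 to 50 kbp insert range come from this example.

```python
import math


def clones_required(probability: float, insert_fraction: float) -> int:
    """Number of clones N such that a given sequence is present with the stated probability.

    Uses N = ln(1 - P) / ln(1 - f), rounded up to a whole clone.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be between 0 and 1")
    if not 0.0 < insert_fraction < 1.0:
        raise ValueError("insert_fraction must be between 0 and 1")
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - insert_fraction))


def insert_fraction_in_mixed_dna(insert_bp: float, symbiont_genome_bp: float,
                                 host_genome_bp: float, symbiont_per_host: float) -> float:
    """Fraction of the mixed DNA, per symbiont genome copy, covered by one insert.

    Each symbiont genome is accompanied by 1/symbiont_per_host of a host genome.
    """
    dna_per_symbiont_copy = symbiont_genome_bp + host_genome_bp / symbiont_per_host
    return insert_bp / dna_per_symbiont_copy


# Hypothetical genome sizes, chosen only to make the example run.
f = insert_fraction_in_mixed_dna(
    insert_bp=40_000,              # midpoint of the 30 to 50 kbp range
    symbiont_genome_bp=3_000_000,  # hypothetical
    host_genome_bp=1_000_000_000,  # hypothetical
    symbiont_per_host=6,           # from the competitive PCR estimate
)
print(clones_required(0.95, f))
```

Because the host genome dominates the DNA mass, the required number of clones is far larger than it would be for a library made from pure bacterial DNA.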

[0252] Ligation and Packaging of DNA

[0253] Three sucrose gradient fractions (no. 19, 21, and 23) from the 0.06 unit/microliter Sau3AI digest were used for ligations consisting of 18 microliters of insert DNA (approx. 2 micrograms), 1 microliter (1 microgram) of SuperCos vector (Stratagene, Inc., La Jolla, Calif.) that had been digested with BamHI and phosphatased, and 5 units of T4 DNA ligase. After overnight incubation at room temperature, a portion was removed to check the efficiency of ligation on an agarose gel, and the remainder was ethanol precipitated, washed in 70% ethanol and resuspended in 20 microliters TE. Two microliters were used for packaging using the GigaPack Gold III kit (Stratagene, Inc., La Jolla, Calif.). In a screen of 12 colonies it was determined that 83% of clones had cosmid-sized inserts. Packaging was repeated with four more aliquots, with a final estimate of 40,000 colony forming units total. After plating all of the packaging mixes, about 10,000 colonies were present and were screened for the presence of KSa.
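
The figures above allow a rough estimate of how well the screened colonies represent a given gene. The following sketch is an illustration only and was not part of the reported analysis. It assumes that the 14,000 clone requirement for 95% probability was derived from the standard relation P = 1 − (1 − f)^N, so that the probability for any other number of clones n is 1 − (1 − 0.95)^(n / 14,000).

```python
def representation_probability(clones_screened: float,
                               clones_for_target: float = 14_000,
                               target_probability: float = 0.95) -> float:
    """Probability that a given gene is present among clones_screened clones.

    Rescales a known (clones, probability) pair under P = 1 - (1 - f) ** N.
    """
    if clones_screened < 0 or clones_for_target <= 0:
        raise ValueError("clone counts must be positive")
    exponent = clones_screened / clones_for_target
    return 1.0 - (1.0 - target_probability) ** exponent


all_colonies = 10_000                    # colonies screened for KSa
with_inserts = 0.83 * all_colonies       # 83% had cosmid-sized inserts

print(round(representation_probability(all_colonies), 2))
print(round(representation_probability(with_inserts), 2))
```

Under these assumptions the probability of representation is about 0.88 if all 10,000 colonies are counted and about 0.83 if only the estimated insert-bearing fraction is counted.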

[0254] Colony Screening and Selection of Positive Clones

[0255] Colony filters were screened by standard procedures, using a ³²P-labeled probe derived from the cloned KSa fragment. Eleven colonies appeared positive and were picked onto another plate. These were screened by PCR using the KSa-specific primers, and four clones, designated 2A, 3A, 4A, and 6A, gave positive amplification (FIG. 9). These were grown for midi-preparation of cosmid DNA.

[0256] DNA Preparation and Analysis of KSa-hybridizing Cosmid Clones

[0257] DNA preparations for cosmid clones were done by growing 100 milliliter cultures in Luria Broth medium and using a Midi-Prep kit (Qiagen, Inc., Valencia, Calif.) for isolation. EcoRI restriction digests of the four clones showed that 2A, 3A, and 6A appeared to have common fragments (FIG. 10), suggesting that the clones overlapped. Clone 4A had apparently lost its insert, and repeated attempts to isolate this clone were unsuccessful. Clone 2A contained fragments with different stoichiometries, suggesting either that two colonies were picked in the original isolation or that the clone was deleting a segment. Further attempts to isolate a full length copy of 2A were unsuccessful. Clones 3A and 6A gave high yields of DNA, appeared normal, and were further characterized.

[0258] Sequencing and Restriction Mapping of Clones 3A and 6A

[0259] All sequencing reactions on cosmid clones were done using a Dye Terminator Cycle Sequencing Kit with AmpliTaq® DNA Polymerase FS (Applied Biosystems Inc., Foster City, Calif.), and reactions run on an ABI 373A sequencer. Initial sequencing reactions on clones 3A and 6A were done using T7 and T3 primers, and the KSa-specific primers. Results confirmed that the KSa region was present in these clones, and that the T3 ends of 3A and 6A had homology to PKS genes. Restriction maps were generated by identifying common fragments in overlapping regions of clones, in combination with hybridization analysis with the KSa probe, and oligonucleotide probes hybridizing to flanking sequences in the SuperCos vector. After thorough mapping and sequencing of the clones, it was determined that PKS homology was located only near the T3 ends of both clones. Since a PKS cluster with the capability to synthesize bryostatin could be as large as 60 kbp, overlapping clones extending beyond the T3 end of 6A were required.

[0260] Isolation of Cosmid Clones 5A and 5B

[0261] A probe was derived from a restriction fragment at the T3 end of 6A and used to rescreen the cosmid library. Twenty-seven additional colonies were isolated. After characterization by restriction digests and hybridization using the KSa probe, clones 5A and 5B were shown to contain multiple EcoRI/SalI fragments with KSa homology (FIG. 11). An overall map for clones 3A, 6A, 5A, and 5B is shown in FIG. 12. DNA preparations of clones 5A and 5B were sequenced from the T3 and T7 ends. For clone 5A, PKS homology was identified at both ends, and the predicted direction of transcription from the T7 to the T3 end suggested that the entire insert (approx. 35 kbp) contained PKS homology. For clone 5B, PKS homology was identified at the T7 end, and homology to glutathione reductase at the T3 end.

[0262] Subcloning and Sequencing of Cosmid Clones

[0263] Detailed maps of the cosmid clones and regions sequenced are presented in FIG. 13. Clone 3A was sequenced from the T3 end, and from both directions in the KSa region by designing oligonucleotide primers to extend existing sequence until sequences overlapped (FIG. 14). All primers for this project were from Integrated DNA Technologies, Inc. (Coralville, Iowa). The putative start of the PKS cluster was identified approximately 5.5 kbp from the T3 end of 3A, and this was preceded by an open reading frame with homology to a transposase. No Shine-Dalgarno sequence is observed upstream of the putative start site, whereas a possible Shine-Dalgarno sequence appears downstream of the ATG site, which may indicate that the actual start site is several codons downstream of the putative start. The region between the transposase and the PKS cluster is likely to contain control elements for transcription of the cluster, and possible elements are identified [SEQ ID NO:29].

[0264] The portion of clone 6A downstream from 3A was sequenced, as was the region upstream from its T3 end. The sequencing strategy for this region is presented in FIG. 15. The sequence of a small gap has yet to be determined, and the sequence of this region of clone 6A is therefore presented as two contigs, No. 2 and No. 5 (FIG. 13; SEQ ID NO:30 and SEQ ID NO:31, respectively).

[0265] PstI fragments of clones 5A and 5B (FIG. 13) were subcloned into a pBluescript vector and sequenced. Each fragment in 5B was sequenced from its T3 and T7 ends, and fragments in 5A that did not overlap with these were sequenced similarly. From this sequence, primers were designed to sequence in the opposite direction on the cosmid template to determine which restriction fragments were adjacent to each other. In 5A, it was determined that PstI fragments A2, F4, and C2 overlapped (FIG. 13, FIG. 16, and SEQ ID NO:32). These sequences have not been precisely located on the map, but they are believed to be in the general area shown in FIG. 13.

[0266] For clone 5B, PstI fragments A4 and B1 overlap, as do PstI fragments D4 and C1, C1 and E1, E1 and A3, and A3 and A7 (FIG. 13 and FIG. 17). Overlapping sequences are presented for the PstI subcloned fragments of clone 5B: PstA4/B1 (SEQ ID NO:33), PstD4/C1 (SEQ ID NO:34), PstC1/E1 (SEQ ID NO:35), and PstE1/A3 (SEQ ID NO:36). Overlapping sequences between PstI fragments B1 and D4 could not be identified, which, together with subsequent results, suggests that a portion of the cosmid clones may have been deleted in this region. The entire sequence of 5B PstA7 has been determined on at least one strand (FIG. 13 and FIG. 18; SEQ ID NO:37), and the end of the PKS cluster has been identified in this fragment, approximately 3,200 bp from the end of A7 nearest the T3 end of 5A.
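
Determining which subcloned fragments adjoin one another amounts to finding sequence shared between the end of one read and the start of the next. The following is a minimal sketch of that idea, offered as an illustration only. The software used for the assembly described here is not specified, the short sequences are invented for the example, and real reads would also require error-tolerant matching and reverse-complement handling.

```python
from typing import Optional


def suffix_prefix_overlap(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that exactly matches a prefix of `right`."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    left, right = left.upper(), right.upper()
    # Try the longest possible overlap first and shrink until a match is found.
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two reads through their overlap, or return None if they do not overlap."""
    size = suffix_prefix_overlap(left, right, min_overlap)
    if size == 0:
        return None
    return left.upper() + right.upper()[size:]


# Invented sequences for illustration.
print(suffix_prefix_overlap("ACGTTGCA", "TGCATTAG"))
print(merge_reads("ACGTTGCA", "TGCATTAG"))
```

A pair of fragments for which no overlap is found, as with B1 and D4 above, returns None and would be reported as two separate contigs.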

[0267] The locations of all regions where PKS homology was identified by sequence analysis are presented in FIG. 13. PKS homology extends throughout the region sequenced, and our current estimate of the total length of the cluster is 52-56 kbp, large enough to encode enzymes to synthesize bryostatin. To date, approximately 38,000 bp of unique sequence have been identified in this region.

TABLE IV
Sequences of B. neritina clones and predicted amino acid product of SEQ ID NO: 29

| Description | SEQ ID NO |
| --- | --- |
| Nucleotide sequence of PKS cluster on clone 3A | SEQ ID NO: 29 |
| Contig 2 sequences from cosmid 6A | SEQ ID NO: 30 |
| Contig 5 sequences from cosmid 6A | SEQ ID NO: 31 |
| Cosmid clone 5A Pst A2/F4/C2 overlap sequence | SEQ ID NO: 32 |
| Cosmid clone 5B Pst A4/B1 overlap sequence | SEQ ID NO: 33 |
| Cosmid clone 5B Pst D4/C1 overlap sequence | SEQ ID NO: 34 |
| Cosmid clone 5B Pst C1/E1 overlap sequence | SEQ ID NO: 35 |
| Cosmid clone 5B Pst E1/A3 overlap sequence | SEQ ID NO: 36 |
| Cosmid clone 5B Pst A7/5A T7 sequence | SEQ ID NO: 37 |
| Predicted amino acid product of SEQ ID NO: 29 | SEQ ID NO: 38 |

Example 5 Combinatorial Biosynthesis

[0268] Combinatorial biosynthesis has generally been used in the search for novel molecules with applications as pharmaceuticals or as platforms for combinatorial synthesis. For example, Shen et al. have demonstrated that engineered aromatic or modular PKSs can be used to generate polyketide libraries of different molecular sizes and shapes (Shen et al., 1999). The biosynthetic genes of the present invention, for example for bryostatin synthesis, could be incorporated into these systems to create derivatives/analogs of bryostatins with improved properties such as reduced toxicity/myalgia, greater efficacy etc. (Shen et al., 1999; Xue et al., 1998). Recently, the erythromycin PKS genes have been engineered to effect combinatorial alterations of catalytic activities in the biosynthetic pathway (McDaniel et al., 1999). This has resulted in the successful generation of more than fifty macrolides which would otherwise be impractical to produce through chemical methods. This leads to the creation of libraries of novel “unnatural” natural products exhibiting altered functions (McDaniel et al., 1999). The bryostatin PKS genes could be used in such a system to create analogs of bryostatin with improved properties.

[0269] The cloned biosynthetic genes presented here have applications in bioprospecting. The cloned PKS genes could be used in PCR, in situ hybridizations, etc., to isolate novel marine (and terrestrial) PKSs and polyketides, which may exhibit novel structures and novel activities (antibacterial, antifungal, anticancer, etc.).

[0270] Because of the unusually low GC content of the biosynthetic clones presented here, these clones have application in screening molecular diversity from environmental DNA samples to identify novel PKS genes on the basis of low GC content. In high-stringency hybridizations, these very low GC PKS gene fragments could be used as probes to detect novel PKSs from environmental isolates. Such PKSs may have gone undetected because the current approach in the field is to use Streptomyces PKS genes as probes; these are high in GC content and might not hybridize to our gene(s), or to other members of this proposed gene family, because of their lower GC content (Schupp et al., 1995). The clones presented herein thus represent a novel class of low GC content PKSs. Precedent exists for expression of PKS in nonpolyketide-producing prokaryotic and eukaryotic hosts (Kealey et al., 1998).
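
GC content, the property on which this screening approach relies, is the fraction of bases in a sequence that are G or C. A minimal sketch of the calculation follows; the function name is illustrative, and the example sequence is SEQ ID NO:3 as given in the sequence listing.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence.

    Whitespace, digits and ambiguity codes such as N or R are ignored.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


# SEQ ID NO:3, written in blocks of ten as in the sequence listing.
print(gc_fraction("acggacaagc gtcattac"))
```

The same function can be applied to longer fragments, or to successive windows along a contig, to compare candidate sequences with a high GC reference such as a Streptomyces PKS gene.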

Example 6 PKS Genes from B. pacifica

[0271] DNA Extraction

[0272] Adult B. pacifica DNA was extracted using a QIAamp Tissue Kit (Qiagen Inc., Valencia, Calif.) as described for the extraction of the B. neritina larval DNA.

[0273] PCR Conditions for the Amplification of SSU rRNA from B. pacifica

[0274] The PCR conditions for the initial amplification of the B. pacifica ribosomal small-subunit (SSU) rRNA gene sequences were as follows. E. sertula-specific 16S rRNA PCR primers were used as previously described (Haygood, 1997). A total reaction volume of 50 microliters contained approximately 200 ng of adult B. pacifica DNA, 1 millimolar each 16S SSU rRNA primer (198F and 1253R), and Taq polymerase and buffer (Boehringer Mannheim Corp., Indianapolis, Ind.). PCR conditions for the SSU rRNA primers were as follows: denaturation (94° C.; 1 min), annealing (54° C.; 1 min), and extension (72° C.; 1 min) for 30 cycles.

[0275] Conditions for the PCR amplification of KS DNA from B. pacifica (using the degenerate primers PKSR and BLCASPKS) were identical to those used for B. neritina.

| Primer Designation and Orientation | Sequence | SEQ ID NO |
| --- | --- | --- |
| 240F (forward) | TGC TAT TTG ATG AGC CCG CGT T | SEQ ID NO: 7 |
| 1253R (reverse) | CAT CGC TGC TTC GCA ACC C | SEQ ID NO: 8 |

[0276] When B. pacifica adult DNA was PCR amplified with the E. sertula-specific 16S rRNA primers, no strong band was obtained. This result shows that the priming sites in the variable regions of the 16S rRNA genes of the larval symbiont of B. pacifica differ from those of E. sertula. However, an extremely faint band was present and was cloned. The sequence of the clone is distinct from, but closely related to, that of E. sertula, the symbiont of B. neritina. The two E. sertula strains (Davidson and Haygood, 1999) differ by 0.6%, whereas the B. pacifica symbiont differs from E. sertula by about 5%, comparable to a species difference within a bacterial genus. This result was confirmed by DGGE of both B. neritina and B. pacifica larval DNA. No band corresponding exactly to the E. sertula band of B. neritina was seen in the B. pacifica sample, but several other candidate bands were observed (FIG. 19).
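
Percent differences such as the 0.6% and 5% figures above are obtained by counting mismatched positions between two aligned sequences. The sketch below shows one simple way to do this, assuming the sequences are already aligned and that gapped columns are excluded; the method actually used for the reported values is not stated, and the short sequences are invented for the example.

```python
def percent_difference(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared positions at which two aligned sequences differ.

    Columns in which either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    mismatches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == gap or base_b == gap:
            continue
        compared += 1
        if base_a != base_b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * mismatches / compared


# Invented aligned fragments for illustration.
print(percent_difference("ACGTACGTAC", "ACGTACGTAT"))
```

Over a full-length 16S rRNA gene of roughly 1,500 positions, a 5% difference corresponds to about 75 mismatched positions.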

[0277] Chemistry

[0278] B. pacifica extracts were made generally following reported methods (see, for example, Schaufelberger et al., J. Nat. Products, 54:1265-1270 (1991)). These extracts were separated using HPLC, which provided profiles different from that of B. neritina. Five peaks occurred in the region where bryostatins appear; one of the peaks (9.623 min RT) had a retention time and UV absorbance maxima similar to those of one of the minor bryostatins; the others did not (FIG. 20). These data indicate that molecules with chemical properties similar to the bryostatins may be present, but that most of them differ significantly from known bryostatins.

[0279] Activity

[0280] PdBu assay of an extract of B. pacifica colonies showed significant protein kinase C binding activity (DeVries et al., 1988) (FIG. 21).

[0281] PKS-I genes

[0282] PCR amplification of adult B. pacifica DNA with the degenerate KS primers (PKSR and BLCASPKS) yielded a strong band of approximately 300 bp, similar in size to that obtained from B. neritina DNA. Because the product is a mixture of sequences from different KS domains from different modules of the PKS-I, the band was cloned. Two clones have been sequenced, and their deduced amino acid sequences are identical to those of KS clones KSb and KSc from B. neritina (FIG. 24).
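
Deduced amino acid sequences are obtained by translating the cloned DNA codon by codon, which means that two clones can differ at the nucleotide level and still encode identical proteins. The following is a minimal sketch of such a translation using the standard genetic code. The software actually used is not specified; the first example is the start of SEQ ID NO:9, whose translation can be compared with the start of SEQ ID NO:10 in the sequence listing.

```python
BASES = "TCAG"
# Standard genetic code, ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1 into one-letter amino acid codes.

    Whitespace is ignored, a trailing partial codon is dropped, and codons
    containing characters other than A, C, G, T are reported as 'X'.
    """
    cleaned = "".join(dna.upper().split())
    protein = []
    for start in range(0, len(cleaned) - 2, 3):
        protein.append(CODON_TABLE.get(cleaned[start:start + 3], "X"))
    return "".join(protein)


def same_deduced_protein(dna_a: str, dna_b: str) -> bool:
    """True when two DNA sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


# First eight codons of SEQ ID NO:9 (Lys Leu Gly Asp Pro Ile Glu Val in SEQ ID NO:10).
print(translate("aaattgggtg atccgataga agtc"))
# Two invented fragments that differ only at synonymous positions.
print(same_deduced_protein("aaattg", "aagctg"))
```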

[0283] All publications, including patent documents and scientific articles, referred to in this application, including any bibliography, are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication were individually incorporated by reference.

[0284] All headings are for the convenience of the reader and should not be used to limit the meaning of the text that follows the heading, unless so specified.

REFERENCES

[0285] U.S. Pat. No. 3,960,897 to Story et al., issued Jun. 1, 1976.

[0286] U.S. Pat. No. 4,560,774 to Pettit et al., issued Dec. 24, 1985.

[0287] U.S. Pat. No. 4,611,066 to Pettit et al., issued Sep. 9, 1986.

[0288] U.S. Pat. No. 4,833,257 to Pettit et al., issued May 23, 1989.

[0289] U.S. Pat. No. 4,940,726 to Pettit et al., issued Jul. 10, 1990.

[0290] U.S. Pat. No. 5,072,004 to Pettit, issued Dec. 10, 1991.

[0291] U.S. Pat. No. 5,358,711 to May et al., issued Oct. 25, 1994.

[0292] U.S. Pat. No. 5,521,077 to Khosla et al., issued May 28, 1996.

[0293] U.S. Pat. No. 5,652,125 to Scotti et al., issued Jul. 29, 1997.

[0294] U.S. Pat. No. 5,672,491 to Khosla et al., issued Sep. 30, 1997.

[0295] U.S. Pat. No. 5,712,146 to Khosla et al., issued Jan. 27, 1998.

[0296] U.S. Pat. No. 5,716,849 to Ligon et al., issued Feb. 10, 1998.

[0297] U.S. Pat. No. 5,716,968 to Driedger et al., issued Feb. 10, 1998.

[0298] U.S. Pat. No. 5,734,038 to Au-Young et al., issued Mar. 31, 1998.

[0299] U.S. Pat. No. 5,744,350 to Vinci et al., issued Apr. 28, 1998.

[0300] U.S. Pat. No. 5,750,709 to Castor, issued May 12, 1998.

[0301] U.S. Pat. No. 5,776,486 to Castor et al., issued Jul. 7, 1998.

[0302] U.S. Pat. No. 5,783,431 to Peterson et al., issued Jul. 21, 1998.

[0303] U.S. Pat. No. 5,804,433 to Gray et al., issued Sep. 8, 1998.

[0304] U.S. Pat. No. 5,811,285 to Gray et al., issued Sep. 22, 1998.

[0305] U.S. Pat. No. 5,824,485 to Thompson et al., issued Oct. 20, 1998.

[0306] U.S. Pat. No. 5,824,513 to Katz et al., issued Oct. 20, 1998.

[0307] U.S. Pat. No. 5,830,750 to Khosla et al., issued Nov. 3, 1998.

[0308] U.S. Pat. No. 5,840,324 to Hennessy et al., issued Nov. 24, 1998.

[0309] U.S. Pat. No. 5,843,718 to Khosla et al., issued Dec. 1, 1998.

[0310] U.S. Pat. No. 5,849,541 to Vinci et al., issued Dec. 15, 1998.

[0311] U.S. Pat. No. 5,876,991 to DeHoff et al., issued Mar. 2, 1999.

[0312] WO 93/13663 to Katz et al., published Jul. 22, 1993.

[0313] WO 95/12661 to Vinci et al., published May 11, 1995.

[0314] WO 97/02358 to Khosla et al., published Jan. 23, 1997.

[0315] WO 97/11709 to Piper, published Apr. 3, 1997.

[0316] WO 97/22711 to Sherman et al., published Jun. 26, 1997.

[0317] WO 97/34598 to Blumberg et al., published Sep. 25, 1997.

[0318] WO 97/47294 to Nobuhiro et al., published Dec. 18, 1997.

[0319] WO 98/01546 to Leadlay et al., published Jan. 15, 1998.

[0320] WO 98/11230 to Oki et al., published Mar. 19, 1998.

[0321] WO 98/17811 to Peterson et al., published Apr. 30, 1998.

[0322] WO 98/20042 to Wei et al., published May 14, 1998.

[0323] WO 98/27203 to Barr et al., published Jun. 25, 1998.

[0324] WO 98/27817 to Sirnyan et al., published Jul. 2, 1998.

[0325] WO 98/41869 to Nasby et al., published Sep. 24, 1998.

[0326] WO 98/48813 to Prichard et al., published Nov. 5, 1998.

[0327] WO 98/49294 to Prichard et al., published Nov. 5, 1998.

[0328] WO 98/49315 to Khosla et al., published Nov. 5, 1998.

[0329] WO 98/53097 to Waters et al., published Nov. 26, 1998.

[0330] WO 98/58063 to Hillman et al., published Dec. 23, 1998.

[0331] WO 99/02669 to Betlach, published Jan. 21, 1999.

[0332] EP 0791655A2 to Dehoff, published Aug. 27, 1997.

[0333] EP 0791656A2 to Burgett, published Aug. 27, 1997.

[0334] Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller and D. J. Lipman. (1997). Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res., 25: 3389-3402.

[0335] Baldwin, N. G., C. D. Rice, T. M. Tuttle, H. D. Bear, J. I. Hirsch and R. E. Merchant. (1997). Ex vivo expansion of tumor-draining lymph node cells using compounds which activate intracellular signal transduction. I. Characterization and in vivo anti-tumor activity of glioma-sensitized lymphocytes. J. Neurooncol., 32: 19-28.

[0336] Basu, A. (1998). The involvement of novel protein kinase C isozymes in influencing sensitivity of breast cancer MCF-7 cells to tumor necrosis factor-alpha. Molec. Pharm., 53: 105-111.

[0337] Bewley, C. A., N. A. Holland and D. J. Faulkner. (1996). Two classes of metabolites from Theonella swinhoei are localized in distinct populations of bacterial symbionts. Experientia, 52: 716-722.

[0338] Caponigro, F., R. C. French and S. B. Kaye. (1997). Protein kinase C: a worthwhile target for anticancer drugs? Anticancer Drugs, 8: 26-33.

[0339] Caspi, R., B. M. Tebo and M. G. Haygood. (1998). c-Type cytochromes and manganese oxidation in Pseudomonas putida MnB1. Applied and Envir. Micro., 64: 3549-3555.

[0340] Correale, P., M. Caraglia, A. Fabbrocini, R. Guarrasi, S. Pepe, V. Patella, G. Marone, A. Pinto, A. R. Bianco and P. Tagliaferri. (1995). Bryostatin 1 enhances lymphokine activated killer sensitivity and modulates the beta 1 integrin profile of cultured human tumor cells. Anticancer Drugs, 6: 285-90.

[0341] DeVries, D. J., C. L. Herald, G. R. Pettit and P. M. Blumberg. (1988). Demonstration of sub-nanomolar affinity of bryostatin 1 for the phorbol ester receptor in rat brain. Biochem. Pharmacol., 37: 4068-4073.

[0342] Dickson, R. B., M. D. Johnson, M. Maemura and J. Low. (1996). Anti-invasion drugs. Breast Cancer Res. Treat., 38: 121-132.

[0343] Donadio, S., M. J. Staver, J. B. McAlpine, S. J. Swanson and L. Katz. (1991). Modular organization of genes required for complex polyketide biosynthesis. Science, 252: 675-679.

[0344] Fine, R., J. Patel and B. Chabner. (1988). Phorbol esters induce MDR in human breast cancer cells. Proc. Natl. Acad. Sci. USA, 85: 582-586.

[0345] Fleming, M. D., H. D. Bear, K. Lipshy, P. J. Kostuchenko, D. Portocarero, A. W. McFadden and S. K. Barrett. (1995). Adoptive transfer of bryostatin-activated tumor-sensitized lymphocytes prevents or destroys tumor metastases without expansion in vitro. J. Immunother. Emphasis Tumor Immunol., 18: 147-55.

[0346] Fornwald, J. A., M. J. Donovan, R. Gerber, J. Keller, J. P. Taylor, E. J. Arcuri and M. E. Brawner. (1993). Soluble forms of the human T-cell receptor CD4 are efficiently expressed by Streptomyces lividans. Bio-Technology, 11: 1031-1036.

[0347] Grant, S., R. Traylor, G. R. Pettit and P. S. Lin. (1994). The macrocyclic lactone protein kinase C activator, bryostatin 1, either alone, or in conjunction with recombinant murine granulocyte-macrophage colony-stimulating factor, protects Balb/c and C3H/HeN mice from the lethal in vivo effects of ionizing radiation. Blood, 83: 663-667.

[0348] Haygood, M. G. and S. K. Davidson. (1997). Small subunit ribosomal RNA genes and in situ hybridization of the bacterial symbionts in the larvae of the bryozoan Bugula neritina and proposal of “Candidatus Endobugula sertula”. Appl. Env. Microbiol., 63: 4612-4616.

[0349] Haygood and Davidson (1998). Bacterial symbionts of the bryostatin-producing bryozoan Bugula neritina, New. Develop. In Marine Biotechnology, Ed. Le Gal and Halvorson, Plenum Press, New York, pp. 281-284.

[0350] Haygood (1999), Save the Bugula neritina! It may help beat cancer. San Diego Union-Tribune, Apr. 14, 1999, Current and Arts Section.

[0351] Jayson, G. C., D. Crowther, J. Prendiville, A. T. McGown, C. Scheid, P. Stern, R. Young, P. Brenchley, J. Chang, S. Owens et al. (1995). A phase I trial of bryostatin 1 in patients with advanced malignancy using a 24 hour intravenous infusion. Br J Cancer, 72: 461-8.

[0352] Johnson, M. D., J. A. Torri, M. E. Lippman and R. B. Dickson. (1999). Regulation of motility and protease expression in PKC-mediated induction of MCF-7 breast cancer cell invasiveness. Experimental Cell Res., 247: 105-113.

[0353] Kageyama, M., T. Tamura, M. H. Nantz, J. C. Roberts, P. Somfai, D. C. Whritenour and S. Masamune. (1990). Synthesis of bryostatin-7. Journal Of the American Chemical Society, 112: 7407-7408.

[0354] Kealey, J. T., L. Liu, D. V. Santi, M. C. Betlach and P. Barr. (1998). Production of a polyketide natural product in nonpolyketide producing prokaryotic and eukaryotic hosts. Proc. Natl. Acad. Sci. USA, 95: 505-509.

[0355] Korczak, B., C. Whale and R. S. Kerbel. (1989). Possible involvement of Ca++ mobilization and protein kinase C activation in the induction of spontaneous metastasis by mouse mammary adenocarcinoma cells. Cancer Res., 49: 2597-2602.

[0356] Kraft, A. S. (1993). Bryostatin 1: will the oceans provide a cancer cure? [editorial; comment]. J. Natl. Cancer Inst., 85: 1790-2.

[0357] Kraft, A. S., S. Woodley, G. R. Pettit, F. Gao, J. C. Coll and F. Wagner. (1996). Comparison of the antitumor activity of bryostatins 1, 5, and 8. Cancer Chemother. Pharmacol., 37: 271-8.

[0358] Lind, D. S., T. M. Tuttle, K. P. Bethke, J. L. Frank, C. W. McCrady and H. D. Bear. (1993). Expansion and tumour specific cytokine secretion of bryostatin-activated T-cells from cryopreserved axillary lymph nodes of breast cancer patients. Surg. Oncol., 2: 273-82.

[0359] Liotta, L. A. and E. Kohn. (1990). Cancer Invasion and Metastases. JAMA, 263: 1123-1126.

[0360] Lipshy, K. A., P. J. Kostuchenko, G. G. Hamad, C. E. Bland, S. K. Barrett and H. D. Bear. (1997). Sensitizing T-lymphocytes for adoptive immunotherapy by vaccination with wild-type or cytokine gene-transduced melanoma. Ann. Surg. Oncol., 4: 334-41.

[0361] Maloy, S. R., V. J. Stewart and R. K. Taylor. (1996). Genetic Analysis of Pathogenic Bacteria. Plainview, N.Y., Cold Spring Harbor Laboratory Press.

[0362] McDaniel, R., A. Thamchaipenet, C. Gustafsson, H. Fu, M. Betlach and G. Ashley. (1999). Multiple genetic modifications of the erythromycin polyketide synthase to produce a library of novel “unnatural” natural products. Proc. Natl. Acad. Sci. USA, 96: 1846-1851.

[0363] Muyzer, G., E. C. De Waal and A. G. Uitterlinden. (1993). Profiling of complex microbial populations by denaturing gradient gel electrophoresis analysis of polymerase chain reaction-amplified genes coding for 16S rRNA. Appl. Environ. Microbiol., 59: 695-700.

[0364] Nowak-Thompson, B., S. J. Gould and J. E. Loper. (1997). Identification and sequence analysis of the genes encoding a polyketide synthase required for pyoluteorin biosynthesis in Pseudomonas fluorescens Pf-5. Gene, 204: 17-24.

[0365] Ochman, H., M. M. Medhora, D. Garza and D. L. Hartl. (1990). Amplification of flanking sequences by inverse PCR. PCR protocols—a guide to methods and applications. San Diego, Calif., Academic Press.

[0366] Pettit, G. R. (1991). The bryostatins. Prog. Chem. Org. Nat. Prod., 57: 153-195.

[0367] Pettit, G. R., F. Gao, P. M. Blumberg, C. L. Herald, J. C. Coll, Y. Kamano, N. E. Lewin, J. M. Schmidt and J. C. Chapuis. (1996). Antineoplastic agents. 340. Isolation and structural elucidation of bryostatins 16-18. Journal of Natural Products, 59: 286-9.

[0368] Pettit, G. R., F. Gao, D. Sengupta, J. C. Coll, C. L. Herald, D. L. Doubek, J. M. Schmidt, J. R. Van Camp, J. J. Rudloe and R. A. Nieman. (1991). Isolation and structure of bryostatins 14 and 15. Tetrahedron Lett., 22: 3601-3610.

[0369] Pettit, G. R., C. L. Herald, D. L. Doubek and D. L. Herald. (1982). Isolation and structure of bryostatin 1. J. Am. Chem. Soc., 104: 6846-6847.

[0370] Philip, P. A., Y. Li, M. Alonso, H. Manji, F. Sarkar, J. Ensley and S. Ali-Sadati. (1999). Sensitization of human breast cancer cells to gemcitabine by bryostatin 1. Workshop In: P. o. t. A. A. o. C. R. A. Meeting (ed.), 90th Annual Meeting of the American Association for Cancer Research Philadelphia, Pa.

[0371] Pluda, J. M., B. D. Cheson and P. H. Phillips. (1996). Clinical trials referral resource. Clinical trials using bryostatin-1. Oncology (Huntingt), 10: 740-2.

[0372] Rangaswamy, V., S. Jiralerspong, R. Parry and C. L. Bender. (1998). Biosynthesis of the Pseudomonas polyketide coronafacic acid requires monofunctional and multifunctional polyketide synthase proteins. Proc. Natl. Acad. Sci. USA, 95: 15469-15474.

[0373] Rondon, M. R., S. J. Raffel, R. M. Goodman and J. Handelsman. (1999). Toward functional genomics in bacteria: Analysis of gene expression in Escherichia coli from a bacterial artificial chromosome library of Bacillus cereus. Proc. Natl. Acad. Sci. USA, 96: 6451-6455.

[0374] Sambrook, J., E. F. Fritsch and T. Maniatis. (1989). Molecular Cloning: A Laboratory Manual. Plainview, N.Y., Cold Spring Harbor Lab. Press.

[0375] Schaufelberger, D. E., M. P. Koleck, J. A. Beutler, A. M. Vatakis, A. B. Alvarado, P. Andrews, L. V. Marzo, G. M. Muschik, J. Roach, J. T. Ross, W. B. Lebherz, M. P. Reeves, R. M. Eberwein, L. L. Rodgers, R. P. Testerman, K. M. Snader and S. Forenza. (1991). The large-scale isolation of bryostatin 1 from Bugula neritina following current good manufacturing practices. J. Nat. Products, 54: 1265-1270.

[0376] Scheid, C., J. Prendiville, G. Jayson, D. Crowther, B. Fox, G. R. Pettit and P. L. Stern. (1994). Immunomodulation in patients receiving intravenous Bryostatin 1 in a phase I clinical study: comparison with effects of Bryostatin 1 on lymphocyte function in vitro. Cancer Immunol. Immunother., 39: 223-30.

[0377] Schupp, T., C. Toupet, B. Cluzel, S. Neff, S. Hill, J. J. Beck and J. M. Ligon. (1995). A Sorangium cellulosum (Myxobacterium) gene cluster for the biosynthesis of the macrolide antibiotic Soraphen A: cloning, characterization, and homology to polyketide synthase genes from actinomycetes. J. Bacteriol., 177: 3673-3679.

[0378] Seow, K. -T., G. Meurer, M. Gerlitz, E. Wendt-Pienkowski, R. Hutchinson and J. Davies. (1997). A study of iterative type II polyketide synthases, using bacterial genes cloned from soil DNA: a means to access and use genes from uncultured microorganisms. J. Bacteriol., 179: 7360-7368.

[0379] Shen, Y., P. Yoon, T. -W. Yu, G. G. Floss, D. Hopwood and B. S. Moore. (1999). Ectopic expression of the minimal whiE polyketide synthase generates a library of aromatic polyketides of diverse sizes and shapes. Proc. Natl. Acad. Sci. USA, 96: 3622-3627.

[0380] Shure, M., S. Wessler and N. Fedoroff. (1983). Molecular identification and isolation of the waxy locus in maize. Cell, 35: 225-233.

[0381] Steube, K. G. and H. G. Drexler. (1993). Differentiation and growth modulation of myeloid leukemia cells by the protein kinase C activating agent bryostatin-1. Leuk. Lymphoma, 9: 141-8.

[0382] Sung, S. J., P. S. Lin, R. Schmidt-Ullrich, C. E. Hall, J. A. Walters, C. McCrady and S. Grant. (1994). Effects of the protein kinase C stimulant bryostatin 1 on the proliferation and colony formation of irradiated human T-lymphocytes. Int. J. Radiat. Biol., 66: 775-83.

[0383] Taylor, L. S., G. W. Cox, G. Melillo, M. C. Bosco and I. Espinoza-Delgado. (1997). Bryostatin-1 and IFN-gamma synergize for the expression of the inducible nitric oxide synthase gene and for nitric oxide production in murine macrophages. Cancer Res, 57: 2468-73.

[0384] Ways, D. K., C. A. Kukoly, J. de Vente, J. L. Hooker, W. O. Bryant, K. J. Posekany, D. J. Fletcher, P. P. Cook and P. J. Parker. (1995). MCF-7 Breast Cancer Cells Transfected with Protein Kinase C-alpha Exhibit Altered Expression of Other Protein Kinase C Isoforms and Display a More Aggressive Neoplastic Phenotype. J. Clin. Invest., 95:

[0385] Wender, P. A., J. Debrabander, P. G. Harran, J. M. Jimenez, M. F. T. Koehler, B. Lippa, C. M. Park, C. Siedenbiedel and G. R. Pettit. (1998). The design, computer modeling, solution structure, and biological evaluation of synthetic analogs of bryostatin 1. Proc. Natl. Acad. Sci. USA, 95: 6624-6629.

[0386] Wender et al., Synthesis and biological evaluation of fully synthetic bryostatin analogues, Tetrahedron Letters 39:8625-8628 (1998).

[0387] Xue, Y., L. Zhao, H. -w. Liu and D. H. Sherman. (1998). A gene cluster for macrolide antibiotic biosynthesis in Streptomyces venezuelae: Architecture of Metabolic Diversity. Proc. Natl. Acad. Sci. USA, 95: 12111-12116.

1 38

1 17 DNA Endobugula sertula
misc_feature (1)...(17) N in this sequence refers to I or inosine.
1
acrtgngcrt tngtncc 17

2 15 DNA Endobugula sertula
misc_feature (1)...(15) N in this sequence refers to I or inosine.
2
ncayggnacn ggnac 15

3 18 DNA Endobugula sertula
3
acggacaagc gtcattac 18

4 18 DNA Endobugula sertula
4
acggacaagc gtcattac 18

5 29 DNA Endobugula sertula
5
gttgtctttg cagcatcgca tgttaccac 29

6 25 DNA Endobugula sertula
6
cacgcccgct atcccagcac ctacc 25

7 22 DNA Endobugula sertula
7
tgctatttga tgagcccgcg tt 22

8 19 DNA Endobugula sertula
8
catcgctgct tcgcaaccc 19

9 315 DNA Endobugula sertula
9
aaattgggtg atccgataga agtcgagaca ttggcagaat cgtttcgagt ctatacggac 60
aagcgtcatt actgtgctct ggggtcggta aaaagtaata ttggtcattt gggggtaggt 120
gctgggatag cgggcgtgac caaagtattg ttgtctttgc agcatcgcat gttaccaccg 180
acgattcatt gtgaggatgt aaacccacag attgcgttgg aaggtagccc cttttatatc 240
aatacggaat taaagccttg gcagtctggt gacggtatac cacgacgggc tggtgtcagt 300
tcttttggtg tcagt 315

10 105 PRT Endobugula sertula
10
Lys Leu Gly Asp Pro Ile Glu Val Glu Thr Leu Ala Glu Ser Phe Arg
1 5 10 15
Val Tyr Thr Asp Lys Arg His Tyr Cys Ala Leu Gly Ser Val Lys Ser
20 25 30
Asn Ile Gly His Leu Gly Val Gly Ala Gly Ile Ala Gly Val Thr Lys
35 40 45
Val Leu Leu Ser Leu Gln His Arg Met Leu Pro Pro Thr Ile His Cys
50 55 60
Glu Asp Val Asn Pro Gln Ile Ala Leu Glu Gly Ser Pro Phe Tyr Ile
65 70 75 80
Asn Thr Glu Leu Lys Pro Trp Gln Ser Gly Asp Gly Ile Pro Arg Arg
85 90 95
Ala Gly Val Ser Ser Phe Gly Val Ser
100 105

11 736 DNA Endobugula sertula
11
aaattgggtg atccgataga agtcgagaca ttggcagaat cgtttcgagt ctatacggac 60
aagcgtcatt actgtgctct ggggtcggta aaaagtaata ttggtcattt gggggtaggt 120
gctgggatag cgggcgtgac caaagtattg ttgtctttgc agcatcgcat gttaccaccg 180
acgattcatt gtgaggatgt aaacccacag attgcgttgg aaggtagccc cttttatatc 240
aatacggaat taaagccttg gcagtctggt gacggtatac cacgacgggc tggtgtcagt 300
tcttttggtg tcagtggtac caatgcacat cttgtattag aagaatatac tcaccgagta 360
acatcaccat tacaaaatac tattttaccc cagaacggtt tgtttattgt tccactatct 420
gcaaaaaatg atgaatgctt aaatgcttgt gtcgaacgac tgttattttt tctaaaaagc 480
aggcaatccg atacatataa aaaatattcc ttaagtgata cagctcctat attgttagat 540
ttagcatata ccctccaggt cagtagggaa gcgatgacaa aacgagttgc ctttgtagtg 600
aaaacaacaa tagagttaat ggaaaaatta aatgcattta tagaaaaaca aaatactata 660
aaagcaagta atataaaagg ttgttactac tcttcgacta aaacatcgag tccatttgat 720
aatgaatcga ctgatc 736

12 245 PRT Endobugula sertula
12
Lys Leu Gly Asp Pro Ile Glu Val Glu Thr Leu Ala Glu Ser Phe Arg
1 5 10 15
Val Tyr Thr Asp Lys Arg His Tyr Cys Ala Leu Gly Ser Val Lys Ser
20 25 30
Asn Ile Gly His Leu Gly Val Gly Ala Gly Ile Ala Gly Val Thr Lys
35 40 45
Val Leu Leu Ser Leu Gln His Arg Met Leu Pro Pro Thr Ile His Cys
50 55 60
Glu Asp Val Asn Pro Gln Ile Ala Leu Glu Gly Ser Pro Phe Tyr Ile
65 70 75 80
Asn Thr Glu Leu Lys Pro Trp Gln Ser Gly Asp Gly Ile Pro Arg Arg
85 90 95
Ala Gly Val Ser Ser Phe Gly Val Ser Gly Thr Asn Ala His Leu Val
100 105 110
Leu Glu Glu Tyr Thr His Arg Val Thr Ser Pro Leu Gln Asn Thr Ile
115 120 125
Leu Pro Gln Asn Gly Leu Phe Ile Val Pro Leu Ser Ala Lys Asn Asp
130 135 140
Glu Cys Leu Asn Ala Cys Val Glu Arg Leu Leu Phe Phe Leu Lys Ser
145 150 155 160
Arg Gln Ser Asp Thr Tyr Lys Lys Tyr Ser Leu Ser Asp Thr Ala Pro
165 170 175
Ile Leu Leu Asp Leu Ala Tyr Thr Leu Gln Val Ser Arg Glu Ala Met
180 185 190
Thr Lys Arg Val Ala Phe Val Val Lys Thr Thr Ile Glu Leu Met Glu
195 200 205
Lys Leu Asn Ala Phe Ile Glu Lys Gln Asn Thr Ile Lys Ala Ser Asn
210 215 220
Ile Lys Gly Cys Tyr Tyr Ser Ser Thr Lys Thr Ser Ser Pro Phe Asp
225 230 235 240
Asn Glu Ser Thr Asp
245

13 312 DNA Endobugula sertula
13
cgattaggtg atccaattga attggcagca ctctcgaagg cgtttgagga gggaacacaa 60
cgaaaacagt tttgcggtat cggttcagta aaatcaaata ttggtcatct ggatgttgct 120
gctggagtcg ttggtctgat caagacagca ttgtcgctgc agcaccgttt gttgcctccc 180
acgatcaact acgaagcacc caatcgggaa atcaattttg aacaatcacc ctttcatgtg 240
attgatgaac tcacggagtg gcggggtcaa ggtggaccac ttcgtgctgg tgtcagctcg 300
tttggaattg gt 312
14 104 PRT Endobugula sertula 14 Arg Leu Gly Asp Pro Ile Glu Leu Ala Ala Leu Ser Lys Ala Phe Glu 1 5 10 15 Glu Gly Thr Gln Arg Lys Gln Phe Cys Gly Ile Gly Ser Val Lys Ser 20 25 30 Asn Ile Gly His Leu Asp Val Ala Ala Gly Val Val Gly Leu Ile Lys 35 40 45 Thr Ala Leu Ser Leu Gln His Arg Leu Leu Pro Pro Thr Ile Asn Tyr 50 55 60 Glu Ala Pro Asn Arg Glu Ile Asn Phe Glu Gln Ser Pro Phe His Val 65 70 75 80 Ile Asp Glu Leu Thr Glu Trp Arg Gly Gln Gly Gly Pro Leu Arg Ala 85 90 95 Gly Val Ser Ser Phe Gly Ile Gly 100 15 324 DNA Endobugula sertula 15 caattgggcg accctattga actgcaagca ctggccgatg tgtatagagt tgataactgg 60 cgcaaaaaca cctgtgccct cggctcggta aaaagcaata ttggccatac ctctgcggcc 120 tctggtgtgg ctggtataca caaggtgctg ttatcgctta agcatcgaca attagtagcg 180 agcctgcatt ttaatagcgc caatcaccac tttgattttc aacagtcgcc tttttatgtc 240 aatacccagc taaggccctg ggatcaagca gagggactag aagaaagccg ccgccgggct 300 gcggtcagtt cttttggtgt cagt 324 16 108 PRT Endobugula sertula 16 Gln Leu Gly Asp Pro Ile Glu Leu Gln Ala Leu Ala Asp Val Tyr Arg 1 5 10 15 Val Asp Asn Trp Arg Lys Asn Thr Cys Ala Leu Gly Ser Val Lys Ser 20 25 30 Asn Ile Gly His Thr Ser Ala Ala Ser Gly Val Ala Gly Ile His Lys 35 40 45 Val Leu Leu Ser Leu Lys His Arg Gln Leu Val Ala Ser Leu His Phe 50 55 60 Asn Ser Ala Asn His His Phe Asp Phe Gln Gln Ser Pro Phe Tyr Val 65 70 75 80 Asn Thr Gln Leu Arg Pro Trp Asp Gln Ala Glu Gly Leu Glu Glu Ser 85 90 95 Arg Arg Arg Ala Ala Val Ser Ser Phe Gly Val Ser 100 105 17 308 DNA Endobugula sertula 17 gagtatggag atccaatgga attgacggct gcagctgccg tctttggacg aggacgaaat 60 cagaaaaatc gtttgctggt cggatcagta aaagccaata ttagtcacct ggaagcagcc 120 gggggtattt ctggactgat caaagcagta ctggcaatgc agcatggcgt gattccacag 180 caattacact gcaaagaacc gagtcctcat atcccctgga aacgtctgcc tctcgatttg 240 gtacaagagc agactgtctg gccggaaagt gaagagcgga tcgcggctgt aacagcgtcg 300 gattagcg 308 18 101 PRT Endobugula sertula 18 Glu Tyr Gly Asp Pro Met Glu Leu Thr Ala Ala Ala Ala Val Phe Gly 1 5 10 15 Arg Gly Arg Asn Gln Lys Asn Arg Leu Leu 
Val Gly Ser Val Lys Ala 20 25 30 Asn Ile Ser His Leu Glu Ala Ala Gly Gly Ile Ser Gly Leu Ile Lys 35 40 45 Ala Val Leu Ala Met Gln His Gly Val Ile Pro Gln Gln Leu His Cys 50 55 60 Lys Glu Pro Ser Pro His Ile Pro Trp Lys Arg Leu Pro Leu Asp Leu 65 70 75 80 Val Gln Glu Gln Thr Val Trp Pro Glu Ser Glu Glu Arg Ile Ala Ala 85 90 95 Val Thr Ala Ser Asp 100 19 300 DNA Endobugula sertula 19 caacttggcg atgaaataga agttcgcgct ctgagtaaag tgtacggaga ttcacagtcc 60 acgacatacc ttggtgctgt aaaaagcaac ataggtcatg ccaacgcagg agcgggcatt 120 gctggtttta ttaaaacggt gctgtctctt taccatggca aaattgcacc caatgcaggc 180 aataccgagc ccaatgcagc tttgaacctt gacgcgtttc attttgcatt accaaaaact 240 ttgcttacat ggccggagtg tgatgttcga cgggcagcga tcagctcact gggttttggt 300 20 100 PRT Endobugula sertula 20 Gln Leu Gly Asp Glu Ile Glu Val Arg Ala Leu Ser Lys Val Tyr Gly 1 5 10 15 Asp Ser Gln Ser Thr Thr Tyr Leu Gly Ala Val Lys Ser Asn Ile Gly 20 25 30 His Ala Asn Ala Gly Ala Gly Ile Ala Gly Phe Ile Lys Thr Val Leu 35 40 45 Ser Leu Tyr His Gly Lys Ile Ala Pro Asn Ala Gly Asn Thr Glu Pro 50 55 60 Asn Ala Ala Leu Asn Leu Asp Ala Phe His Phe Ala Leu Pro Lys Thr 65 70 75 80 Leu Leu Thr Trp Pro Glu Cys Asp Val Arg Arg Ala Ala Ile Ser Ser 85 90 95 Leu Gly Phe Gly 100 21 304 DNA Endobugula sertula 21 gccttgggtg atcctattga atttggcgca atcaaggctg tgtatgggcc tggtcggtct 60 tctccgctgg tgctcggtgc acttaaatcg aacatcgggc atttggaagc gactgcaggt 120 gttgcagctc tgattaaggc agttttggtt cttcaacatg gcgtggctcc ggccaatttg 180 cactgtcaca aattgaatcc gcttctggat atcgacggct tcaatgttgt gttcccgcag 240 tctgagaccc ccttgcacag ctctctgcag ctacttggcg ggtatcagtt cgttcgggtt 300 tggt 304 22 101 PRT Endobugula sertula 22 Ala Leu Gly Asp Pro Ile Glu Phe Gly Ala Ile Lys Ala Val Tyr Gly 1 5 10 15 Pro Gly Arg Ser Ser Pro Leu Val Leu Gly Ala Leu Lys Ser Asn Ile 20 25 30 Gly His Leu Glu Ala Thr Ala Gly Val Ala Ala Leu Ile Lys Ala Val 35 40 45 Leu Val Leu Gln His Gly Val Ala Pro Ala Asn Leu His Cys His Lys 50 55 60 Leu Asn Pro Leu Leu Asp Ile Asp Gly Phe Asn Val Val 
Phe Pro Gln 65 70 75 80 Ser Glu Thr Pro Leu His Ser Ser Leu Gln Leu Leu Gly Gly Tyr Gln 85 90 95 Phe Val Arg Val Trp 100 23 314 DNA Endobugula sertula 23 acttggtgat ccctattgag gtgggggctc ttacagaatc atttcgatcc ctatacagaa 60 aaaaagaact actgtgcctc gggatcggta aaaagcaata tcgggcatct tttaaccgcg 120 gccggagtat ctggagtagt caaagtgtta ctcgctttga aacataagca acttccacct 180 tcctgtcatc tggtgaaaat caatgagcat atcaaccttg aggacagtcc attttatatc 240 aatacggcat taaagaaatg ggaagtatcg gaaggtgagg ctcgcagggc cgcagtcagc 300 tcgtttggtt cagc 314 24 103 PRT Endobugula sertula SITE (1)...(103) Xaa = Any Amino Acid 24 Thr Trp Xaa Ser Leu Leu Arg Trp Gly Leu Leu Gln Asn His Phe Asp 1 5 10 15 Pro Tyr Thr Glu Lys Lys Asn Tyr Cys Ala Ser Gly Ser Val Lys Ser 20 25 30 Asn Ile Gly His Leu Thr Ala Ala Gly Val Ser Gly Val Val Lys Val 35 40 45 Leu Leu Ala Leu Lys His Lys Gln Leu Pro Pro Ser Cys His Leu Val 50 55 60 Lys Ile Asn Glu His Ile Asn Leu Glu Asp Ser Pro Phe Tyr Ile Asn 65 70 75 80 Thr Ala Leu Lys Lys Trp Glu Val Ser Glu Gly Glu Ala Arg Arg Ala 85 90 95 Ala Val Ser Ser Phe Gly Ser 100 25 306 DNA Endobugula sertula 25 ccactcggcg acccaatcga gatggcagca ttaaaacagg cttttgggac tcaaaagaaa 60 aaatactgtg cgatagggtc ggtgaagagc aacattggtc atgccgatac ggcggctggc 120 gtcgctggtc tcatcaagac ggtgatggca ctcaaggcgc gtcagatacc gcctagcttg 180 cactttgaga cccccaatcc gcagatcgat tttgccgaca gtccctttta tgtaaataca 240 accttgaaag attggaacac caacggtgtt ccgcgccgcg cgggcgtgag ttcgtttggc 300 atcggt 306 26 102 PRT Endobugula sertula 26 Pro Leu Gly Asp Pro Ile Glu Met Ala Ala Leu Lys Gln Ala Phe Gly 1 5 10 15 Thr Gln Lys Lys Lys Tyr Cys Ala Ile Gly Ser Val Lys Ser Asn Ile 20 25 30 Gly His Ala Asp Thr Ala Ala Gly Val Ala Gly Leu Ile Lys Thr Val 35 40 45 Met Ala Leu Lys Ala Arg Gln Ile Pro Pro Ser Leu His Phe Glu Thr 50 55 60 Pro Asn Pro Gln Ile Asp Phe Ala Asp Ser Pro Phe Tyr Val Asn Thr 65 70 75 80 Thr Leu Lys Asp Trp Asn Thr Asn Gly Val Pro Arg Arg Ala Gly Val 85 90 95 Ser Ser Phe Gly Ile Gly 100 27 309 DNA Endobugula sertula 27 
gtggtcggag atccgattga ggtcgtggga ctgacgaaag cctatcaagc gcacactcag 60 gaacgtcaat actgcggact gggttcggtg aagacgaata ttggccatac ggactcggct 120 gctggcattg ctggacttct caagatcgtc atggcgatga agcatcgtca actgccgccg 180 agcttgaatt ttgaaacacc aaatccagac ctggatctgg agaatagtcc gttcttcatc 240 cagacgaagc tgaaggattg ggaaagtgtg gggcctcgtc gtgccgcgtt gagttcgttt 300 ggtttgggt 309 28 103 PRT Endobugula sertula 28 Val Val Gly Asp Pro Ile Glu Val Val Gly Leu Thr Lys Ala Tyr Gln 1 5 10 15 Ala His Thr Gln Glu Arg Gln Tyr Cys Gly Leu Gly Ser Val Lys Thr 20 25 30 Asn Ile Gly His Thr Asp Ser Ala Ala Gly Ile Ala Gly Leu Leu Lys 35 40 45 Ile Val Met Ala Met Lys His Arg Gln Leu Pro Pro Ser Leu Asn Phe 50 55 60 Glu Thr Pro Asn Pro Asp Leu Asp Leu Glu Asn Ser Pro Phe Phe Ile 65 70 75 80 Gln Thr Lys Leu Lys Asp Trp Glu Ser Val Gly Pro Arg Arg Ala Ala 85 90 95 Leu Ser Ser Phe Gly Leu Gly 100 29 6000 DNA Endobugula sertula misc_feature (386)...(388) TAG may represent a transposase open reading frame. 29 gatggaactc attaccaccc acaaaaaagt ccgtttcttc aacgcggttg atttaattaa 60 ccagctaatc aacgaacaac aaaagcagca aacgggcaaa ctcatcagag ccttattgca 120 ggtggattgt ttaagtattg atgaactcgg ttatatccca ttccctaaat ccggtggggc 180 gttgctcttc cacctcatca gtaaacggta tgagaagacc agtattatca tcagcaccaa 240 tctggctttt ggggaatgga acagtgtgtt tggtgatgcc aagatgacca ccgcgttatt 300 ggatcgtatc acgcatcatt gttcaatcat cgaaaccaag catgcgtcgt atcgttttaa 360 gcagagtcag aaacagacat gaaagtagct ttcaccggtg ggacagtgtt agatgcaaac 420 cccgggtcag ctttaagtgc aatttgaaaa ccaatgtgat aattgtggct aagatcaata 480 aaaataaaat ttttttattg attatgatga tccacgttaa aaaaaatact ataaatatga 540 aataatattt caactttatt tttgatggtc gttgttgagg aattttttgt gagttatcga 600 gatattttga aggctttaca ggatgaaaaa attagttttg aagaggctaa atataagtta 660 ataaaaagaa aagataaaaa atcaaaacag cgtttaaatc atgatcgtga attaaatcga 720 tcgatgaata ttacgccaaa aatagtgaat aattacggtt tagtattatt gggcggtcat 780 ttatttgaag aactccgtct gagtgaatgg aaagctgcca accctaaccc taatgaagtt 840 agcattcagg tcaaggcatc cgccattagt 
tttaccgata ccttgtgtgt acaaggttta 900 tatccatcac actatccctt tgttccgggc tttgaagtat cgggagtgat tcgtcaagtg 960 ggtgaacaca taaccgactt acacgtgggt gatgaagtta ttgcgttcac aggatcatca 1020 atgggagggc atgctgccta tgtgacggtg ccacaagatt acgtggtacg aaaacccaag 1080 gacttatctt ttgaggatgc ctgtagcttc ccattggctt ttgcgaccgt ctatcacagt 1140 tttgcacggg gaaaattatc tcacaacgat catatcttga tacaaacggc gacaggtggc 1200 tgtggtttga tggcacttca gttggcgcgt ttaaagcagt gtgtgtgtta tgggacctcc 1260 agccgagaag acaagcttgc actcctcaaa cagtgggcac tgccctacgt cttcaattat 1320 aagacgtgca atattgatga ggagattcaa cgcgtcagtg gtcatcgagg tgtcgatgtc 1380 gtcttaaata tgctcccagg agagcatata caacaagggc tgaatagttt agccaaggga 1440 ggccgttatt tggaactgtc gatgcatgga ttgttaacga acgaacctgt cagtctgtcg 1500 tctctgcgtt ttaatcaatc cgttcaaacc atcaatttac tggggttact caataagggt 1560 gatgatggct ttatcgggtc tgtattagcg caaatggttt cctggattga atcaggtgat 1620 ttagtgtcaa ccgtgtcgcg tatttatccg ttggatcaga tcggtgaagc gttacgttat 1680 gtctctgaag gggagcatat aggtaaagtc gttgtgagtc atacagcgac agagccgatg 1740 gattgcagac agcgctgtat tgacaatgta ttgaagcaag ggcaaatggc ggccttgacc 1800 gcgacagggg gaaaaagccg ggtgtggggt ggtactggtg tcaatgacaa accgtctcct 1860 gctgttggta tagaggagcg tttattggaa gggatagcgg tgattggtct gtcaggccag 1920 tatccgaagt cgaagacact ggagcaattt tggcagaccc tagcggatgg agtggattgc 1980 atctcagaga ttcctgctga tcgctggtcg ttagaagagt attactcgcc aataccggaa 2040 gggggtaaaa cgtattgtaa gtggatgggt gttttggagg acatggattg ttttgatccg 2100 ttgttttttg cgatatctcc tcgggaagcg gaagtgatgg acccacagca acggttattt 2160 ttagagaatg catggagttg tatagaggat gcggggatta accctaagat gttatcccgt 2220 agtcgatgtg gggtatttgt tgggtgcggt gcgaatgatt acagcgctct aatgaacagt 2280 agccactcaa cgagtctcga attaatgaag gaattaggca acaactcttc cattttatct 2340 gcacgaatct cctacttttt aaatttaaag ggcccttgtc ttgcgattga taccgcatgt 2400 tcttcttcat tagtggccat tgccgagtcg tgtaatagtc tggtgttggg tactagtgac 2460 ttggcgttgg caggtggagt gttgctgatg ccaggtccat ccttacatat aggtttgagt 2520 catggagaaa tgttatcagt agatggtcgc tgctttacct 
ttgaccaacg ggccaacggt 2580 tttgtacctg gagagggtgt cggcgttgtc ttgttaaaac gcatgtcgga tgcggtgcgt 2640 gatggtgatc ccattcgtgc agtgatacgg ggctggggtg tgaatcagga tggtagaagt 2700 aatggtatta cggcgccgag ttcaaaagcg caaagtgctc tggagcaaga ggtttatcaa 2760 cgttttaata ttgatccatc gagcattacc ttagtcgaag cacacggaac gggcaccaaa 2820 ttgggtgatc cgatagaagt cgaggcattg gcagaatcgt ttcgagtcta tacggacaag 2880 cgtcattact gtgctctggg gtcggtaaaa agtaatattg gtcatttggg ggtaggtgct 2940 gggatagcgg gcgtgaccaa agtattgtta tctttgcagc atcgcatgtt accaccgacg 3000 attcattgtg aggatgtaaa cccacagatt gcgttggaag gtagcccctt ttatatcaat 3060 acggaattaa agccttggca gtctggtgac agtataccac gacgggctgg tgtcagttct 3120 tttggattta gtggtaccaa tgcacatctt gtattggagg aatatcttcc tcactcgaca 3180 ggaacaatag agtcgtttgc tgcgaatcat gcaagtacag ttattattcc tttgtcagcg 3240 aaaagtcata atagtttata cacatatgct caaacgctat tgatattttt aaaacgtagt 3300 caggttactg acgctaaaaa aatcacaata gatcacatgg aatgtcgctt gttggattta 3360 gcctatactt tgcaagtggg tcgcgaggca atggacaaac ggataagttt tattgtcaac 3420 acaaagcaag cactcgtgga aaagctaaat gcttttctag agaaggaaaa gactataaca 3480 gactgttacc actatttatt tgatagtgac aaaccgtcta cagaaatttt ccgtttagac 3540 gaagatgaca aagtattaat aaacagctgg ataagtcaaa gtcaatatca caaattagcc 3600 gaagcctgga gccaaggact cgatatcgac tggacgctac tctataccca ctcatcaacc 3660 cctcgtcgca ttagcctgcc cacgtatccc tttgccagag accgctactg gctaccagaa 3720 aaaccacgct ataacgcggc taatcatccg gtatccaacc atcaaacaac cactcagaat 3780 cactcacgct ttgccattga tacggatcac gatgtcgttg ccgagatcat gcaaaagaca 3840 catcaacagg aactggaaca atggttatta aaactgttgt ttgtgcaatt gcaacatatg 3900 ggattatttc aacatcgtgt ctttgagaca gcgaccgctc tacgtcaaag tgcaggcatc 3960 gttgataaat atgatcgctg gtggcatgag tgtttaagcg ttttacagga tgcgggttat 4020 cttgaatgga aagacgatag cgtagccgcc gcacaggcat tggagtctga atcgcaagag 4080 gcatggtgga gccgatggaa cacggagtat aagcattacc agaatgatcc ggaaaaaaag 4140 acgttagcga tattgattaa cgattgctta caggcattac caggggtgtt aagtggtgag 4200 caattaataa cggatattat tttccccaat ggttcgatgg agaaaatgga 
aggcttatat 4260 aaaaataata ggattgcaga ttattgtaat cagtgtgttg gagacctgct cgtccagttt 4320 attgaagcac gtctgtcaag agatgccaat gcgaggatac ggattatcga aattggggcc 4380 ggtacggggg gcaccaccgc gatagtgctg ccaatgttac aagcctatca ggatcatatc 4440 gatacgtatt gttatacgga tgtttccaaa gcctttttga tgcatggaca ggaacactac 4500 ggcgaacaat acccctatct gagttattgc ctctgtaata ttgaacagga cttagtggct 4560 caaggaatca gcgttggtga ttatgatatt gcgatcgcag ccaatgtatt acatgccacg 4620 cggaatatac acgaaacggt cagccatgtg aggcaggcat tggcggccaa cggtttattg 4680 attttaaatg agtttagcca aaaaagcgtt ttttcgagtg tgatatttgg tttgatcgat 4740 ggttgggcct tatctgaaga tacgggattg cgtattcctg gaagcccagg gttatatcct 4800 aagcagtggc aagcggtact ggaggcgtcg ggttttggtg acgtggaatt tccgctccat 4860 gacgctcgtg agttgggtca acaaatcatc ctggcaacca acgcccatgc gaacgttgct 4920 agcgatcttg cgacatcggt gattgatcat gcccccaaga gattgccatc cgccgaggtc 4980 agcatggatg agagagtgag ccatgatgcc atgatgaagg catcggtcaa acagttgttg 5040 gtagagcaat tatcccagtc tttaaaactg gatatgaatg agattcaccc tgacgaatcc 5100 tttgccgatt atggtgttga ttccattacc ggtgctagtt ttattcaaca gcttaatgac 5160 acgctgacac tgactttaaa gacggtgtgt ttgtttgatc acagctcggt aaaccgactg 5220 acggcctatc tgttatctga ctatggtgat gatatcgcgc agtggttagc aacggcacca 5280 gcgttggttg atcatccaca gagtgtcgtc agtcaggtgt tgcctgaaag gtcgccagca 5340 agcacacaag ccaagccctt gccttcagtc cccccttcgt tatcgatgga gtcacccgtt 5400 caacaggagt cgatagcgat tattggtatg agcggacggt ttgcggcgtc agaaaacctg 5460 gaagcgtttt ggcaacagtt ggcacagggt gtggatttgg tcgaacccgc gtcacgttgg 5520 gggccacaag cggagactta ctacggcagt tttctcaagg atatggatca atttgatcct 5580 ctctttttta atctctccgg tgtggaagcg agttatatgg acccgcaaca acgttgtttt 5640 ctggaggaat cctggaatgc actggagaat gcgggttatg tgggtgatgg catagaaggc 5700 aagcgttgtg gtatttatgc cggttgcgtg tccggtgact acgcacaact gttgggcgac 5760 caacccccgc cccaggcttt ttggggcaat gccagttcta ttattcccgc ccggattgcc 5820 tattatttaa atcttcaggg ccctgctacc gcggtggata ctgcctgctc aagttctctg 5880 gtggcggtgc atttggcctg ccaggcccta cacctggatg aaatggagat ggccttggca 
5940 ggaggtgtgt ctctttatcc aacccccatc attgtatgag tctttgcgtg gtgcagatat 6000 30 4700 DNA Endobugula sertula misc_feature (1)...(4700) N refers to any nucleotide. 30 ancaatttat nacatccncg ggaaaanacg aacggtcacc atntaggcag gcattgcggc 60 caacggttat ttttttaaat gagttaacca aaaaagngtt tttgnagtgt aaattggttt 120 gncganggtt ggccttattt aananaggga ttgngtattc ttgaaaccca gggttatttc 180 ctaacagtgc aancggtact gaggcgtcgg ntttggttac gtgaatttcc gctccatgac 240 gctcgtgagt tgggtcaaca aatcatcctg gcaaccaacg cccatgcgaa cgttgtagcg 300 atcttgcgac atcggtgatt gatcatgccc ccaagagatt gccatccgcc gaggtcagca 360 31 5686 DNA Endobugula sertula misc_feature (1)...(5686) N refers to any nucleotide. 31 gcncttnccg cggtggcggc cgctctagaa ctagtggatc ccccgggctg cagtattcgg 60 aaatgcaggt caatcagatt attcaacggc aaataaattt atggatgagt ttgcacgcta 120 tcgtaatgct ctggtcaatc gcaaagagcg ctatggttta acactatcga ttaattggcc 180 gtactggaga gaaggaggta tgagtattga ggaaaatttt gaaaatataa tgcaagagaa 240 taccggtatg tccgccctgg agacatcaca aggtattgaa gtattacaaa gagcttggca 300 gttgcagtac acgcaattgt tggtaatggt cggagagatg aagcgaatgg agagcttttt 360 gcacaagcag ggtttcgagc agattcctgt ggtatccgcc gatactgtca gcgagaataa 420 aacctcgact attgagaatc tttcagccga tgtagataca ttaccattca ttgaggttca 480 ggcatacaat atggaacaaa aaacccttga ttacttaaaa aatgtatttg ccaccacaac 540 acaaatcccc gagaaaaata tttatgttca tgaaacattg gataaatacg gagttgattc 600 attgttggtg atgaaaatga ccaatcaatt ggaaaaagta tttggaaaat tatctaaaac 660 cctatttttt gaatatcaaa ccattcgcga actgggcgat tatttcctga aatttcatga 720 tgaaaagtta agggagtttt ttcagataga tagcaaacta tctatgttaa ataatcacgg 780 agagattgaa gttcaaaaaa aaggggatga accatcggtt ggagacagat ataagtcagc 840 tggatgccgt gcctatctcg gtttatatcg cctgtgtcag cagtgaatca tcaaccaaaa 900 aaatgttaac aatggttccm atantcatca gccagtaatg ggatattggc gawtattggg 960 tctgagkggg tcgttattcc mcaagcctga gaaatatngg agggaatact ggggaagaaa 1020 tttgtgtcaa nggcaaggga ctggtattan cnggaaantt ccaaanggag ccgttgggga 1080 ttggsaagac tattwyacms mtnnngatcc stattcagcc mggtgggaca tcgcagtaaa 
1140 tnggggkggt tttattcggg atgttgataa gttcgatccg ttatttttta atatttcccc 1200 tagkgrggkg gagctyrcts atcctcagga aykwttattt yctagrgtcc gcgtkggctg 1260 cattggaaga ccctggawat tgccgggnat tatttgcaaa tgttgtcatc aaggactaaa 1320 tcttcattct cgtcggraga tgttggtgtt tatgtggrag tratgtcttc agaatatcag 1380 ttgtttgctt ttgaacagaa wttacgtggt caccccatat cctcnggttg ggagttatgc 1440 cagtattgct amccsggtgt cttatgtttt aratctacac nggcccaasc atgacagtgg 1500 atmcgatgtg ktctarttcg ttaacgacgc twcacctagc atgkcaggga tttaaaactg 1560 ggkcgaaact gaccygggta ttgkcggkgg agttaawatt accattcacc ccmataaata 1620 tyaggcsctg agtcacgcyc aaattattty tactagtggt sgttgccaaa rttttggtga 1680 acagggacag ggttatatcc ctggtgaagg agtgggtgcc ataatactga agcgcttggt 1740 cgatgccgag cgtgacggtg atcatattta tggtgttgtt aaaggcagtg ccgttaacca 1800 tggtggtaaa accaacggct ataacgttcc taatccgaat gcacaacagc aagtggtgag 1860 tcgtgcacta cgagaagccg cagtaaaccc ccatcatgtg acttatattg aggcacatgg 1920 aacaggaacc caattgggtg acccgataga aattactgkt ctrammaaag cgttcaatag 1980 tttgaccaat gagcttggtt taagcgctgt gsccaaacma tygkgtttga tcggstcark 2040 gaagtcaaaa tatagggcat tgtgagycas caagccggtg ttgcagctat tagcaaagta 2100 ttgttacaaa tgcaacacgg gtcaaatagt cccttcttta cattcaaaag cattgaatcc 2160 caatattgat tttactgtga ctccctttgt agtaaaccaa gggttattgg actggaaacg 2220 acttgaagtt gaaggaaaga gggtrccgag aatkgctkky mwwwckkytt ttggggccgg 2280 tggctcaaat gcccatgtag tgattgagga gtacgttgcc agcaatgaaa agcaagagga 2340 ttttcaagga aaagtaatta tccctttatc ggcwatagac ttskgatcar ctacaaraaa 2400 warkggatcg tttgcttaag tttatcraaa aaaatgaagc aaaraggtag ggaawtksgc 2460 ttaattgwty ttgccgwawa cattgcaact tgggcgcgag gtcaatgara ggaacgtctg 2520 gncmttngan ttgtaggaat cnaataccaa atgcttaang gaaagatttt agcaaaggnt 2580 ttaaatactc agaaaatnga tgcacanatt tttcggatac ttatcaaaag rcattttatc 2640 ggggttcgta ctagacctgg gtgcgttgra tttcgctatt ttttctgaag atgaagaata 2700 tggccaacac gcttgatatt ttggattcaa aaaggtaaat actttaagnc tggcggagct 2760 ttgggtaaaa ggtgtgacta ttgattggaa taaatggtat aacgcattat taacccagaa 2820 
taaatatttg aaaccntcgt cgtattagtt tgccnaacng tatccttttt ccagggatcg 2880 ttattggatt nccnaagtgc ttttccacaa ncaaacattt tctacagtaa ttgaggcaga 2940 cgccaaccma aacattgaat gagctactgt gttttgaaga aaaatggcag gtgcaatcgg 3000 aactacatga ctctgttgca gatcaatcta atgttatcaa tacattaatt tgttttttaa 3060 ctgagaaaga gcatcaaaaa gcattacaac aatcaatatc attccatagc ccgaaaacac 3120 gattgatttt tatcagccag gctcaggctt atgagcagta ttcatcagat cactatgcgg 3180 ttaatccaga aataggaaag acgtaccaac aggcttttca acacattgtg aaaagtattc 3240 ataaaagtga tgtcacggac ataatgtatt tatgggctct agaggatgaa cgctggatta 3300 cgtctcctct acctattgta tatcttttaa aaagtattga ggtttcttta ttaaaaccar 3360 aaaaattact atttgttgga gaatttaaga caagcttakc rrcgaytgty acyykraakc 3420 cwrgkkgggw ttygmamrwy ckkwaksgtt dgtgcaacsg ratwtkragg ttgcggtgtt 3480 attaraggcm rtggaaggta ctyaatccca tmcagtgaca aagcaaatgg atctttggat 3540 agaaaaattg tggtcgtcct taaaagccca aaaagttcat agtagcttat accaaaatgg 3600 tcgtagatat ttttctgaaa accccamccg ctgcaanctt gtcatgaacc aaagtattca 3660 aatgcttaca gggracttta ttgataacag stgsytgtgr aggactgggt tttgtcttyg 3720 cagattattt ttccaagaca tataaaatta atctgatatt ggttgggcgc tctgatcttg 3780 ataaagagaa agswwtcgsr ratwcrgrmt ykgkwwmaat caggtagtcg agtggcttat 3840 gttcagacgg atatctgcga tgaaaagaat ctccaattgg aattggatat tgcccaaaaa 3900 tattgtggcc ctattcaggg tgtcattcat gccgcgggca tcattgatca gaagacaatt 3960 tttgaaaaaa gtcctgaaaa ctttcaagca gtattagccc ntaaaattca gggtacattg 4020 attctggata acgtattgtc agcgcaatca ctggatttta tatgttactt ttcttcaagc 4080 tcggctctat taggtgatgc aggatcatgt gattatgcaa tggctaatcg atttttgatg 4140 gcccatgcac agtatagaaa tacctyggta tctgaargaa aamscaaggg raagacmctg 4200 kttwttcatt ggcccgcctg gaatgtgaaa ggaatgggat tgaatggact ggaatgagaa 4260 cgtgaaamca ragttctwty ttaagtccaa gcgggcaasg tctattggac ataaaggaag 4320 gttgtgaggt tattgaacac attrctggct caggattatt ytcagtgtcy tawattggst 4380 ggkaggaaaa accngtatcw aacaattttt tgggtctcac acaaagatgt ttctnacctc 4440 acaagtgagt caagggcagg magtrawgaa cwwasrrswk kmykkrrass ksyamyaaac 4500 gagctgagat 
agaagacttt aagtgttgaa gaatgtatta ttttggactt aaaaactctg 4560 attacagagc aacttaaaat acccatcagc tcatctggat gtagagagta atttagcaga 4620 ttttggtttt gattcggtca gtttagcaaa cttttcccgt gstttaagta ttcmctatca 4680 ttycaawawt acgccrtstk tatttttcgg atatcctacc atagagcgty taarccgtta 4740 ttttttaaaa gaacmcmctg cgsttatgga ggcgttttat cagcagaaaa aaacatytwa 4800 tagtaacaat acvctgtccg ntatagtccy tcatgtcaaa gaaaagccgw caactgatct 4860 aatatcatcc arcngcctct nccttttatt gcagatccat tgccccctca ggstattgag 4920 agtattgatg agcctattgc cattattggt atgagtggtc gttttccaga agcgcgtacg 4980 gnttaaagca atgtgggaga ttttatccga aggtaaaagt sytgtgcagg agattcctat 5040 agagcgcttt anattggcat gaatattatg aacacccatc ggatgatgtt ygaanaandb 5100 taatagtaaa tggagygcct gcattcctgg tattaaagaa ttcgatccac aatttttcga 5160 aatttctcca agagaggcaa aaaarctgga ccctcttcaa cggcwcttat cacaggaatc 5220 mtsgaatgca ttggwaaats ctgcttatgk wwwmywacrc wkwgmtmwtw aracratggg 5280 atayktkkat tggtrttgaw smaggktwtt atmmrrrymw gmtcaatkmr gwygacsgca 5340 cacwttwawc catmakrmta ttttrgcata ccmgtytgsc agtwytywtt arakyttaat 5400 ggscmwrssa tggcwrtwaa wrccgcwtgy tcctccgsyw tggyygcrmt tcaccamgct 5460 kscsysagtt tackwcarca agcaatkyga wrcgsckawk gwcscggcag cwwwyttrmw 5520 mwwyacrssk sawswtkaws tggscwtgay ssawgsgrgy mtgakmysac mwgawgsyat 5580 amygawakac ckarnrtcam csygccaaks gcryagtgmy tggakagsmw gytgwtgcar 5640 tcgtaytgma acrwmtcttk sgggktttcc aaaaggggtt mmaaat 5686 32 4744 DNA Endobugula sertula misc_feature (1)...(4744) N refers to any nucleotide. 
32 gngatgagat tgatgagaat acttaatttg gtcgaanagg ccattacntc tatgattctt 60 ggtgaattta taagccaatt aaccngtgat ttagtttgga atatgaaaga acccgtttta 120 tttgactatc ngaatattaa tactttatcg aatatgatcg agaatgaact cgaagctgtt 180 gaggtatagt tatgttagaa gttattaata gatactgcca tggatacgta ttcgtgccag 240 tggtattggc cntagaagaa aaagggtttt ttgacctttt tacaaggaat agatacctta 300 catttgaaaa aataaaaaca gaattaaatg ctaatagtgg ccatcttcaa gtagccttac 360 gcatgttgca gtctgtttca tggatatcat gtgatgataa agggtatgta ctaacagatg 420 cagcggacga aagaaataaa atatctagtg attttataga gctttttaat ttctctatga 480 gtcgctattt agaaaatatg gaaaggcatg gattaaaaaa atggatagat caatccggag 540 ataactgggg tatttcaaac cctgtattaa ccgatttttt ggatggtgtt ttaattattc 600 ccttattact agaactgaag gaaaatggtt attttgatgc gttaaaaaat gkwaatagtc 660 taaataaaaa attattttta ggntgatatc gaacaatcgg nttcgcaawg aaattattac 720 actattttaa acaaaagaac tggctccaag aagaatraag agacgtttta cttcacaaaa 780 ntctggtcaa tttnaycact caacgaattt ttattaccgc aatccattgc ttcttataag 840 cccatgttta tctcgggata acggaattaa tgtttggtaa tgctaggagt atttttaaaa 900 agggattgca tggagaggag agccatgttg accgaacctt aaatgttatt ggtagtggtt 960 ttcaacatca aaagtacttc gctgatatcg aagcgttagt cattcagtta tttaatgata 1020 mtttktacga tsraywsccg aaatrkrtts crratatggg ttgtggtgat gggactctac 1080 taaaaaatat ttacaatatt atcaaggaaa aatctgcacg aggaaacgtg ttgaatcact 1140 atcccgtggt acttattggt attgattata atgaagccgc tttgcaggaa actaacaata 1200 cactggcagg tgttgataca agacactatg ttttaaaagg cgatattggt gatcctgaag 1260 gaatgataag tgatctatat gatttaggta ttaaagatcc tgagaatata ttgcatgtgc 1320 gttcatttct ggatcatgat cgtccttata ttgcacccac agaggtgatg aatattgaag 1380 cacgttcaaa gatatttgat cagggcgtgt atgttgattc agaaggtcaa gcaatatcgc 1440 ctgtggttat gatacaaagt ctggtggaac attttaaacg ctggtcttgt gtaaagacga 1500 aacatggctt gcttatatta gaagtacatt ctcttaaccc tgaggttgtc aaccaatatt 1560 tggatgaaag tgaaagtttg cattttgatg cctatcatgg tttttcctct caatatttag 1620 tatcggctga ggattttcta atatgtgctg cagaagctgg tttattttct aaacctgatg 1680 tttctcaaaa ttatccaagg 
aacttacctt ttactcgaat taccctaaat ttttttgaaa 1740 aaaagcctta tcaaattcgt cacccgaatg aaaatgattt gtctgcattg atggatttag 1800 aaaaaatttg tcgacctaat aatcaatgtt tatgcattga tgaccttcgc caacgcatag 1860 atgaataccc aaaaggtcaa tgtgttttag aattaaacaa taccattgtt gcagtgattt 1920 attcacaaaa gtgtattaat agagtgttag gcactgctgc aggtgtttgg carswswwtg 1980 scmdhggaat rtgbdwdcac datttvtaba thactbgttt atcaatdtaw trcccaaaat 2040 aaaaaaagaa tatgccatmc aattattaca gtttatcttc tatytatcat ggtgttcawa 2100 atgatgttga agatgttatk ggtattgatg aatgttatca gtgcttaaat gagaaaacga 2160 tacaagcagg cagttttatg gaaagtgagt cagttgatgt tttatattcc aagagtagaa 2220 aaacatattg ctaagtatcc caatagatat tggagtaaat gctctggatg cagagcagga 2280 aatggggttg tttggtgcta agtggttact atctattttt caaagccaag gagtgatgaa 2340 aaaatcaggt gagtattatc aaaaagatca attngaggtt gatgttaaat attattccaa 2400 aatattatcg attatttgag tgcttgctac tcatatttng aaaaaagaaa gcttatttca 2460 attcaaaaaa atacnggtgc aaacactttc caatattgat gaatttgctc ttaacgatcc 2520 attggtntga gtttgcttcg tnttaagcgt acgttttcct ctcaatatgc tagccttatg 2580 ccgwttctac gattaatggc atcgtgcctt tctcggtatt tggaaatatt aacaggcaaa 2640 atacaggcgc atgacattat ttttccagaa nggagggatg aatttatttg aaggtatttt 2700 taaaggctat caactttcag actattttaa tcatattctc gcagagctga tttatgaaag 2760 ggctanacgc tctatccggt gggtaatatg aantaaaaca attcgtattt tagaaataag 2820 gagcaggtac ctggtggtgc caacagagtt tgtattngaa tagnagcttc mccgctnctc 2880 gaatggttat aagagtttta cntatactgg atatctncgt ccntcgttcc ttcgttatgg 2940 gagaaaagtn agattttycc gataaatatn ccctggtntg caatataagg tgttagatat 3000 ntgaaagnca atttagantg cacaagggtt ttaccctgat agctttgata ttngtgtatg 3060 catctaatgt tnctccacga tacgaaawta tatacagtat accctttccc aaagtgagtc 3120 acatgctaac gcaaaatggc nttgttaatg ttgaatgaan tttactcngg atgaanggat 3180 ttgttactgt ttaccggtgg tttgttagat ggcctttggt tatatgaaga ccctaccaat 3240 cgattggata atgtctgctt gttaaatgtt gatcagtggc gatctatatt atttaaatca 3300 ggctttnaaa aatgttaaag actttgtttt accttttgaa aaacttaata ttgagcaaag 3360 tcaaagtatt attgtctctg agtggattaa 
tgaagacctg tctagtaatg nttgaaaatg 3420 tggtgaaaaa taatcanttg tttnagaaat acaaaatcac tcntgatncc gattactngt 3480 ggagnaataa aattagntta caattnaaaa gacaantcmc wtcgttanca caatagtatt 3540 ggaagaaaat atttttataa aattttagng gggataaaaa gaaaattatn ggatttttct 3600 ccntaaacgc ccctttgatt ggagtttatg ggttggattc atattcgaac ctacnttgga 3660 anttaaagat cattactcgg kragcmttyt tcyataaaac trgaasmtac tttkktmtky 3720 mawkatkraa yrmtksckkm rsctmtytgw kwcmtccsay atsattcmag wtrascytsr 3780 wattrtcgmt arakwcccta ttacggaaga gataatgact ggaggtacgt caagggtaar 3840 aacagggcaa tcgaatsaka atgaacctat tgcgattatt ggtatgtcyt gtttatttcc 3900 aggtgaggtt acgacagttg atgagttctg ggaattatta atacaagaaa gacatgccrt 3960 tcaaccctta cctaagggac gttggcaatg gccakaaggt gttgatccat cgggagcaca 4020 acttggcatt gatcagggtg gatttctgga tggtattgat acctttgatg ccsacttctt 4080 tcgtatatcg agaaaagaag cggagttwat ggaccctcas caaagaaaac tacctggaat 4140 taarttggca ggtcatasag catgccggat ataaacccat cggytttttc tggtcaaaga 4200 natyggyatc tatgtggggt gctttgtcac cggtaattta tatgggagtt atttaactaa 4260 aagtgaccaa angccctaaa aaccaaccgg naaggcctat ttkcatgacc argtartana 4320 ttgttgttcg tytttmcccc aataanaatt ttcctatttt ntattaattt tttaaargtg 4380 cccmscstcc tctwtctgat wccgngcttg ttcaaryagt tttaggttgc ctwtttgacc 4440 caancarttt tatgcgnatt caattcgggg nanggngtga atcaggcntc tggtgggntg 4500 gggaycaatt waatrtcccc tccsmrtgaw accggtttct tnattayywa gcaggtntgt 4560 tntcaaaatc ngggaatgta aacctttnga tccaccgccc gttggttttn tncctgggna 4620 aagggggcgc tnttcttttt ttnaatcntt ttctcanccc nattttaaaa ngattgtttt 4680 ttnggggttt taaagggggg agatnaaaat ngggggcaan cattnnttac ggccctaacc 4740 tnng 4744 33 1954 DNA Endobugula sertula misc_feature (1)...(1954) N refers to any nucleotide. 
33 gangattcct nccnctnccc attgaaaaga ggatggattn gancatatgg gtgtgcctgc 60 aagaagataa gtcaatataa tgtaactcag aaaaatcaat tcccaaaatg aataccccnc 120 aatcwataca aaaaawattg awagattttt kggtkgacat tactaacttt ttsgaggcna 180 agacatcmat ccmrgcmgga tgcctggtga ctatggtgkt gattccatta ttaggtatga 240 gatttyttaa tcgaattaac cyccaccttt aawatagaag ctgatgcttt attactaaca 300 gaaggaacga ttmaccagta tatctcataa arkwcmttct tttattgttg ataaaaaaaa 360 ttacccaatg ttaccaaatt ttggattaga aaatgattct aataaagaaa ataaaggctg 420 ggtaaagcct tcttttattg aatttattaa atttgaaatc aatcctgaat atatagaaag 480 cagtacaaaa aataaagatt acgcgattct tgaaaatcta ataaataatg gagttggagt 540 ttggagagaa aataatcatc tatgttttga gtttttttat gaaactcata caaatgaaac 600 aattaaaaaa atagtgtttt cacccgaaat actttttaac tctctagata aaggtaaacg 660 atactttcca agtagctgcc agcaaaaaaa cagtctatat caaacggaag ttgagaagtt 720 tccatataat cttattcaag gatttagagt ggaaatgcca gtcaatattg aaattttaaa 780 taaagcattt aatcatttgg ttaacacata ttcaattttc agaacaaaag caatgttgat 840 caataagcaa tggattcagg taatacatga tggtttatca gtaagatgcg aaganaatta 900 yatacgaagg attatctgca ggaaaaagat tttacgcaac aactaatnag tatttcaaaa 960 agagcaaggt aaaaaattat ttgatatcga taatctgcct ttattaaaaa tttattttat 1020 ccataatggt aaagacttag cagctatttt tgttcatgcg catcattttt gtgccgatgg 1080 atttacattt ttttcttttc agaaagaatt tcatgatact trtgaaagta ttatraacgg 1140 antggrrwat ccggaaacgk gttcsawaaa gtgatggctg aatatggcca ctttgcattg 1200 tgtgaatata atcccaaaaa caaggagctg acaaaaaact ggcttgataa aattcgagat 1260 aaaaattttt ctttaaaatt taaagataag aaagactatg tcggtcaact gtcaagtgga 1320 gaaaaaatta ttgagctaga agtttctgta aatatgctgg aaaaattaag attatttaat 1380 gatgcgaata ataccacact gacgcaattg ctatgttgtg ctgttgcaat tttactgtat 1440 cgcctctcga ggctaccagt acccttgcaa atggtcaaca gccgtagaga taaaatagaa 1500 tttgaaataa tgatgggtga ttttgcatca actctgccct atggatttta ggaacctttc 1560 caaaagcatt ttctctattc cnggatggta ccttttttaa gttattggaa aaanggaaaa 1620 aggcnttnaa ttntcccccc naggattttt taaanggggt ttggatnntt tntcngggaa 1680 ccctcaanaa aaaaaaaatt 
tntttccaaa aaaaaaaggg gccccttaaa ntccccatta 1740 agggaatttt ttaaattttt taatttcccg ggnaaaatta tttntttaaa ttccggaatt 1800 aaggccnaan tggaattaat tggnaaaatt tccantttgg gtttttaaaa aggggaaaaa 1860 ncccannaat ttgggtttcc ttaaaaanaa aaaaaagggg ggnggccccc cggtgggttc 1920 nttnntgggg gnaaaaattt aaaaatttaa tttn 1954 34 2672 DNA Endobugula sertula misc_feature (1)...(2672) N refers to any nucleotide. 34 anccgaaaaa naccnaaagg gnngccggcc cntgtcctnc gagtgcatna taaaaaancc 60 agtnataagn nggnnacaat antcatgccc cgcgcccncc gnaagnaacc tnantgggtt 120 naaggcttca agggcatcgg tcaaggaacc tttcggcggg cttttgctgt gcgacaggct 180 cacgtntaaa aaggaaataa atcatgggtc ataaaattat cacgttgtcc gggcgcggcg 240 acgaatgttc tgtatgcgct gtttttccgt ggcgcgttgc tgtctggtga tctgccttct 300 aaatctggca cagccgaatt gcgcgagctt ggttttgctg aaaccagaca cacagcaact 360 gaataccaga aagaaaatca ctttaccttt ctgacatcag aagggcagaa atttgccgtt 420 gaacacctgg tcaatacgcg ttttggtgag cagcaatatt gcgcttcgat gacgcttggc 480 gttgagattg atacctctgc tgcacaaaag gcaatcgacg agctgsrcym scrmaktygk 540 gmcmccgkmw cctwmrarst twttcscaaw rragkktywt tmawmaagsm cscygskrky 600 gswwwtggwr ctawccacgm arcssmwwty gaaamaccks rkcyggntkw csrawawmwa 660 cmrsmycasc cttggwawmm armrwsmtga syywgckcwg aamaakgtwa ccstcrgkgc 720 cgmtwwgkkc aawkttwmac cysrwrwwrr ymcmaamatt garrcsttgm ycgraaccsc 780 gmtgaaaaan ncgctghntg nnaatgtrvg gcgtntggat gtchcaaagc aaatggcasc 840 agacaangaa agcgatggat gaactnnngg cttccttatg tccgcccggc caktcatgat 900 ggaatgtttc ccccsggtgg tgttatctgg caccagtgcc gtcgatagnt antgcnaant 960 tngantaant tnattnatca tttngncggg ntcctttncc ggncgatccn gccttgttta 1020 cggggcggcg acctcgncgg gttttcgcta tttatgaaaa ttttccggtt taaggcgttt 1080 ccgttcttct tcgtcataac ttaatgtttt tatttaaaat accctctgaa aagaaaggaa 1140 acgacaggtg ctgaaagcga gctttttggc ctctgtcgtt tcctttctct gtttttgtcc 1200 cgtggaatga acaatggaag tcaacaaaaa gcagagctta tcgatgataa gcggtcaaac 1260 atgagaattc gcggccgcat aatacgactc actataggga tcatatttat ggtgttatta 1320 aagggagtgc catcaatcat ggtggcaaaa ccaatggcta tagtgtgcct 
aatccggata 1380 agcaacagcg tgtcattagt gaggctttgc agcgggctca aatagctcct catcaagtca 1440 gttatgtaga agcgcatggt gcgggaagcc gtttaggcga cccaatagaa attacggctc 1500 tcagcaaagc atttaacaat gttagtgcgc aatttaatgt gaaaagtgca gccaatcaat 1560 cgtgttttat tggctcggta aaatccaata taggaaactg tgaatctgca gcagggacnt 1620 gccagtatta gcaaagtatt gctacaaatg aaacatgggc aaatagtgcc gtccttgcat 1680 tcaaaagaac tgaatcccaa tattgatttt tcagcaactc cctttgtggt taaccaagaa 1740 ctgcgcgatt ggcagagacc gctgattgat ggaaaaacag tgccgagagt tgcgggtgtc 1800 ttttcatttg gggcaggtgg ttccaatngc nttacgtggt gattgaagag tatattgcga 1860 agataccgac aaataacacc agggaatcta taaaccatag gtctattatt ccattatcag 1920 cacgaactgc tgagcagttg cggcaaattg ccagtagatt gctggcattt attgaaaaga 1980 acaagcaaga cagcgtggtt acccccttaa tagatattgc ttatacattg caggtaggac 2040 gcgaagcaat ggatgaacgc ttggggttta ttgtgagttc aacccgatga attagtcgaa 2100 gaactacgaa gatatcttca aacacacgat gatatggaag agctttatcg aggtcaggtt 2160 aatcgatatg aagacacctt tcttactatg gcggctggat ggaagatctc tcttgaggct 2220 atcccaccca tttgggatta aaaaacgaaa aactggtctt aagtttaatg ccaattattt 2280 gggatttaaa aggggtcttt gtggatttaa wttkgggrkr agwtatassw tkkyttmcca 2340 aargrkgwtw ktccycsgcr matkarmkka ytacctrtcc yttyggcrgs matattttta 2400 rgwtkktamm swtyrnmccc tcwtwcctyt ttktgrcccc agggnccaaa tttattttng 2460 tttgngggga atttngtttt aaaaaagaat tcggttaanc ccacctnccn ttaaactttc 2520 attttggggg gnaatgggtt ttattggnaa cccattccna aaaccaaaaa ngggcctttt 2580 ttttttccat tccnaaaaaa accaaatttt ggcccctttt ttgggggggg gaaaaaaaaa 2640 acccnaangg ggaaaaattn tttttaaaaa aa 2672 35 2132 DNA Endobugula sertula misc_feature (1)...(2132) N refers to any nucleotide. 
35 nnnanntttc cnattccctt gggcggaaat tttttgccca gggnccgnat aaccaaagga 60 ccctttttcn ggccccttaa aaaaacccaa tttnccccnt ttaatccccc cgaataaaag 120 aacctttccc aaaaaaaggg naanttgaan tggggggnan cntgggaaat cccaagccaa 180 aaaaaggccc aaymtcgccc waraacrkkc cawwaatsss gawaasmcyy ccagawarwa 240 ttkwtkrrwa mwrawcyagy wwmscamatc rgrtgttwta tggrrsssrg wmyawwtraa 300 aarymytcca wyktkttkss grrtcaatka tgssrkwtyy tcaaymttgg gactcmcyym 360 tcmmmwwttt gaaaaccmyw attatakktr taagsgggcc aaataatcaa tgttggatat 420 ggttaamccg ataaaaaaaa gcctcaataa attttnctgc caacaactaa gacagctcta 480 caataaacat aaaagcaata atgagtccct gtgattattt cccatgaaaa aaacaatggc 540 attttaatag atagatctca tactgaatcg aatattgcca ttataggtat atcagggtgt 600 tttccggatg caaaaaatgt taatgaattt tgggaaaatt taaaaaatgc tcgtcatagt 660 gttaaagaaa ttccctataa ccggtcttgg gatattgata attactttga tacttcttcg 720 caaacacatg cacaggaata tgttaaacaa ggagcatttt tagaaaatat cgatcttttt 780 gatccgctgt tttttaatat ttctccggtg gaagcagagc ttatggatcc aactgaacga 840 tttttccttc aggaatcctg gaaagcgatt ganangatgc tggttatgat gcatcaaact 900 ntaagtggaa aacgntntgg ggggtatttg cctgtgcaaa gggagactac catgccatta 960 ttcacaagca ggataaaact cgtatcatga ccactgactc tatgcctcct gccaggtttg 1020 cttatttatt gaatttgnnt tagggcctgc agttcacgtt gatancnggc ttgttcatcn 1080 gtctttggca gcaattgctt acgcatgtga tagcctcatt cttagaaatt gtgatgttgc 1140 cattgcagga ggtggaaata tcaactcaac tcccagcctt ttgatcagtt caagtcaact 1200 tggtttgttg tcaaaagatg gccgatgtta tgccttsdat caacgtgcaa acggaacggt 1260 attaggggag gcggtascat cgattatttt aaaaccctta caacaagcga ttgacgatgg 1320 tgatcaggtc tacggattaa ttaagggttg gggaatgaat caaratggaa aaaccaatgg 1380 tmttactgct cctagtgtta agtcacaaat tcakttggaa acggatgttt atcaaaaatt 1440 tatgatwaat cctgaacata ttackatggt tsmagcccat ggaactggga ctaaactasg 1500 agatcccatt gaggytcagg cattamcaga agcttttcas aaatatacty aaaaaacakg 1560 gtmttgtgca ctagngttct ttraaaarwa aatattggac atacnttttt cccgctgctg 1620 graktckcta gatgttaatm aagggttttg ttgtccattt cwcancatty acmargwttc 1680 yytycrtart twwtaattyw 
maarstatna mttwttcaww attcctatyg tnaawwaccc 1740 ywattttkkw ktaaaamcag cycatwwttw wyyssskgtm attwwnyycc nctttwttrw 1800 wmcccmmytt gcgrrcsgtt tttttcgtkk ktgtttcrwc akagaatctm mmsycctttt 1860 ytygcmmmma anmrnnttaa acmmmtwrcc tttytttrgr kggsgycccc cncccngggg 1920 gaanccccca antgggtccc cnnttttggg gggggggntt tngnnaangn aaaatttttt 1980 tttcatgccc nnanaaaagg tccttccgca acctttttta aaaaataanc ccntccccna 2040 aaaanttggg natttgggan tgggaattaa aaaggcccct tttttacccc cccgngttta 2100 attttaattc cccccttttt tggttccggg cc 2132 36 2169 DNA Endobugula sertula misc_feature (1)...(2169) N refers to any nucleotide. 36 nnaccaattt tccgaaaccc aagncatttt gaaaggggtt tttggggccc ggggttgaaa 60 aaaaaaangg ggttttttgg cccccccccc nnagnaanta aaaatgggta aggaacncgc 120 ccccccactt tggaaaacct tccccnaaaa aaaataaaaa ggcntttgga attttttaac 180 naaaatnncg ggggntgggc cntttaaana acccccccnt ttncaaaaaa tgcgarrggk 240 gggyctccwr rnaytyyaaw awgramgsgk tawytmccwa ktgrggggwn ttwtatcawt 300 aaaggnssgg ggktytawkw tttawraarr ggragcttta graawawaaw arwcmgtkgk 360 ktttaaraga rattkwwaar rraactggrw traaktwwww rwrttatwat anaaatrkkw 420 aakggwwrta tagagggaaa aaaatttaaa ggataaatga argaaaccca tcwccattta 480 ttttccaaga sgaccaaaga aatgatagaa gttgttaaat ttatggrtgc gtaaaaagaa 540 attttcccaa awttttaawt yctttgggtt aaaggattaa acmcttgrtt ggaagcaatt 600 atatggtaaa gaacmtccag ctcgtattag tttgccawgc tatccttttg ccaaagagcg 660 gttattggtt ggatactgat aagttagtcg acggtagtta tytcaaccct agrcaagagg 720 gaatwaatac agatagtgat aagtttgatg aaaagcttta tgaatccttg ttggacaatc 780 ttttttccaa aactatgacm cctgatgaag ctattaagtt aatggaagag gaggtatcat 840 gaaaaaatta attaaattga tttatgaaaa agtttttgaa aataaactat caaaatcaga 900 agccttgtcg ttgattagtg gattgaaggc gagcaatact actatccttc atccccttat 960 acatgaaaac acgtcaagtt tttttgaaaa aaaattcagt tcaacttttt ctggtagaga 1020 atttttcttt cggatagatg ctaaccttaa aaaaagtgta ttatctcctg taacatacct 1080 tgaaatggtt tatgctgcag caacaaaggc aatggctggt gagaaatttt cagcgcaatn 1140 ttaaaaaaat tgagtggcaa tatccagcta ttgttcatga agagtcgata 
acagttcata 1200 ttcgtttttt taaagatcca aatacctggt tggatacaag tgaggagaaa tttttatgct 1260 atcaaattta cacaatttca aataatcaag aaacanangc gatattgttc acaaccgggg 1320 tgtaatagat tatgatcata aaaatagtga attaagtcca cttgatattt tttcactaca 1380 aaagcatatc agtgaatatt ttctagaccc taaagaggat agtgattttt ttgaaaagag 1440 cgataaaagt aatgagccct attatcagag tattgaattg ttacatatta attttcagaa 1500 agaagcgctt ataaaattat cgtttgatca cgtatcagga tacatataac catcaagagt 1560 cattggtttt acatccagat atactggagt tggctttaca atcctgtagc ttcttatgcc 1620 ttgatatggc agatactgga atctgagttt ttcgggggag ttgcagccca gtgagtggta 1680 gatgctttta tcaaatncat gtctcggctg gtccagggac ctcaaatggt gggktttggg 1740 ttaccggctt aacarsyttc catggaaggg tagggnttaw atagscrcan tatttggccy 1800 tkggtgrtgg aatrawrgtw atkcskgggg wccwgstamw wagggttggg ttytcaaaac 1860 cawawraamm skgtttyttg rrkwwttttt tssmmmmgcc scnaaattng aaccccccnn 1920 ngngtaaanc cccnngaaat tnntnttttt tttttncccc gnnccccaan cnnagaaang 1980 aacctttncg nggttttggg caattaaatt taattagggc aaaccccccn ttaatnggaa 2040 ggggggncca nttgggnggt ttttttngga aaaaggaagg gnaaattggg gnnaaaaagg 2100 cccccccaaa nttnggtttt aaaaagggga aaaaaaaatn aaccgtttaa aaaaattnnc 2160 ccccaaant 2169 37 8380 DNA Endobugula sertula misc_feature (1)...(8380) N refers to any nucleotide. 
37 gcaccgttgg aacgttatgg catcgattca ttgattgtga ttcaggtgaa tcaggcgttg 60 gcggctattt ttgatgcgct gcctaaaaca ctgttatttg aatatcaaac gatagacgcg 120 gtcgtggctt acttggttga gcagcaccgc caggcatgta gggtgtggac ggggttaacg 180 gcaacgggtc aagctcaaag agagggtgtc atctcctcta cctcatcagc gggtgttgaa 240 cctgtgacac cgagacagaa agagggtcat cctatacaga aagacatcaa gtgccgagaa 300 cacccagtga cagacgagcc tatagccatt attggtctga gtggacatta tccgcaagcg 360 aatagtttgg atgcgtattg ggaaaacttg aaggcaggaa aagattgtat tcgtgaaatt 420 cccgatgacc gttggtcgct agacggtttt ttccatgaag atgttgaaga agcgattgcg 480 caagggaaaa gttacagtaa atggggcggt tttttagagg gatttgctga ttttgaccct 540 ctctttttta acctatcgcc gcgagaggtg atgacgatcg atccacagga gcgtttgttt 600 ttacagagtg cgtgggaagc tgtggaggat gccggttatc gcgtgctcag cttgcttcgc 660 agtttaacaa gcgtgtgggt gtatttgcgg gtattaccaa gacgggtttt gatttttatg 720 gaatacaatc ggatcsagct sbtytnycgc wtnatacttc ctnttackcc aggtttaaaa 780 rgccwmgwtc agctntkttt tsgggttttt taabthhgcg ggkgggtktt ttkvsccvwa 840 tnagcancsg dcggtttttk mattttttta wtggraanac nncaatcggg atcaacntct 900 ttntccgctt atacttcctt tagctcagtg gnnctnaatc gtgtgtcttt attttttggg 960 tttacaaggc ccaagtcntg tnctattgat accatgtgct cctcatcttt gacggcaata 1020 catgaagcct gcgagcatct gcatcgccaa cgatgtgaac tggctattgc ggggggagtg 1080 aatctttatt tgncaccctt caacctatat tagattgtgt actttacgga tgctttccaa 1140 agagggcctg tgcaaaagct ttggttatgg tggtaatggg tttgtaccgg gagnaggggg 1200 ttggcgctgt gttgttgaaa cccttgnntc tagagccatt caggatcagg atagtatata 1260 tgccattatt agagggagtt gtgttaatca tggtggcaaa accaatggtt atactgtgcc 1320 taatccacat tctcanaggc gatcttantt cgtgaagctt tggantaaag ctcangngtt 1380 aantgcccgt atnggtcagt tatatagaag ccncatggta canggtacag agttgggtga 1440 ccncaataga ggtaagaggc ttaacgcaag ccttntcaac aagatactga tgatgttggt 1500 ttttgtgtat ntggngttca gttaaatcta natattggtc atcntggaag ctgccgctgg 1560 tatcgctggg ctgagcanaa gttattctgc agatgaagta tgaaaaaata gtggcaagcc 1620 tacatgcaga aagactgaat gccaatataa attttgaaca aactcctttt gttgttcagc 1680 aatcacttaa tgaatgggaa 
agaccaaacc ttcatgttaa tggaaaaatc aaagaatatc 1740 ctaggaccgc ggggatctct tcttttggtg cgggagggac gaatgcacat ataataatac 1800 aggagtatat tccagaagtc agtcagacac gacaatcaga ggtcaggaat aaaccagctc 1860 acccggtggc cattctgcta tctgcgcata cttccgctca gttactgaag atggccgagg 1920 cacttttact atttattcgt accatagtga ataatatgga ctcatcctat tcggcagggg 1980 atgagatgac tcacttggta aatgtagcct atacattaca ggttggacgt gaagctatgc 2040 aggaacgcct ggggtttgtt gtgaattccc tgagtgatat tgaagtgaaa ctacaaaaat 2100 ttattgataa ggaaaatgat attgaagact tttatcggga tcaaatcaag actaaaaaag 2160 aaatctcagc tctatttaat tcggatgaag atttgcagga agtgattaaa caatggatgc 2220 gacaaaaaaa actatccagg cttttgtcac tttgggttaa gggagttcac tgtgattgga 2280 acttcttgta tcaacatatg cgaaccaaac cttatcggtt acatttacca acgtacccat 2340 ttgcttataa tcgatattgg attgatgata ataataaaaa tcaatcgact gtagttgaaa 2400 aaaccaacac tattattaaa gagagaaaag agcaagttag attagagccg cttgatttta 2460 tggaaaggaa aaaacttaat gtccatgaaa aaaagccatt tcattgttct ttatcaactc 2520 aatcagaggc ctggtccggg gcgaacactc agacatccag tggtaaacaa agacgatctt 2580 atgtacaggt gcttaaacaa gacgatatat taagggatct taaatcagcg ctgcctacag 2640 ctgttgaagg tatgatacca acattaaatc gaactggtgt catgacagaa agcttaagct 2700 cctactcaga agcatttgca aactatgctg gtatgtgtgg tggagaagta ttggacttgg 2760 ggtgtgccta tggaattgca acgattgcag cgttggagcg aggggctcaa gtattagccg 2820 tagatatgga ggcacagcat ctggaaatat tatcagaccg tattcgggat gaagtgaagt 2880 cgcgtttatc gacacaagta ggcaagttgc tggatcttca ttttgatcaa gaacgttttg 2940 ctgcgatcca tgcgagccga gtgctacact ttttaaaccc acaggatttc cagcaagcat 3000 tacaaaaaat gtatggctgg ttaaaacccg gaggaaaatt atttattgtg acggataccc 3060 cttatatggg ttattgggcg agcaaagcag gggtttatga aactcgtaaa gcagcagggg 3120 atttatggcc aggctacata gataatgttg gttctcactt taatactaaa gagatagaag 3180 gggccccaac tctgatcaac ccgatggacc cggaaatact gcatcgtgaa tgcaaaaaat 3240 ttggttttca tgtagaagag actgtttttt ttgcaggaga agcctttgca ctaaataata 3300 gtttagaaaa atcaggtaga gagcatgttg gtataatagc attgaagccg gaattggaag 3360 attccgacag gcttgagaaa tcgctattgc 
cagtacggaa aactgaaacg gagaataagg 3420 aaattagcct actgcaaata cagacaatgc ttagggagag tcttgaattt gaattggata 3480 tagagcccgg tatgttggat gagttaaaac cttttacaga tttagggttg gactcgataa 3540 atggagtcac ctggatacga aaaatcaata gtcactatgg attatctatg actgcgacga 3600 aagtatatga ttacccaaat attattgagt tggcagagtt tttaagaaaa caaattattt 3660 cgaatgatga aaagcagcat caaccatcta tatcaacaat atttcccact tcattggatg 3720 aattattgaa aaaaatacaa gaaggtactt tagggattga agaagccgac caattaattg 3780 atgaactacc tgattaccat ctagatatgg aactccatga gttgttataa gggaaagcga 3840 ggtatttttg tgtcacaccg atggatggta aaaccatttt ggctgaaaag aatttagctc 3900 aaatcggcgc agctttgctg cgtccgagtg atttgacttg ttatggtgaa ctcaactatg 3960 cttgtacggc atttccttac ataagtaggt gaaaaatgga aacaattagt gtaaaccaat 4020 ttagagacaa tttgaaaagt tttgtagaac aagcagttag cacgcatgag ccaattaaag 4080 taacgcgcag agccagtgag gctttcgtcg tgataagtgc cgatgattgg gagcaagaac 4140 aggaaagcct ttatattttt cagaatagtg atttgatgca acaaattgca gattcgcttg 4200 gtacgcatac tcagggcaag ggatacaaac caacggataa tgagttgaat gaaatcactg 4260 gtgcttgaag gccatacctg ggaaaactgg gaaaagcttt gcgagcaaga taagcggtta 4320 cacaaggcgt tatgcaaact actcaaagaa atgcttcact cggaagatct aacctccgga 4380 ttaggtaaac ctgagccgct taagcataac ttatctggct tatggtctcg gcgcatttcg 4440 caaaaagacc gactgatata tcgctttatt ttcgctatcg gtggtcacta cgatcaacat 4500 ttagttgcca taacgccata acaagggaaa atatgaagcg cagcggaatc ttttcccttg 4560 tggttacgct tgttataagg ttgtttattc atttagactc cctctgtgtt tactgcaytg 4620 tgtggtagcc agtccagtcc acgttttttg kgggcsrwtt tcaatgtgct tgttatacac 4680 ttagatgtcc gaaaakgraa mccamccmcc attgtatatt tyttttaact caatggataa 4740 atgttttata gctaactgtg aagcttcgat tgcctgattg aactcacgat catttttctc 4800 tgatttttca taaaaggcgt taggtgaaaa tgaagctggt tctgattttt tatgtacagc 4860 tttattcctg aatctaatta aaactttcat atattgatat gcttgctttg atttatcaat 4920 ttcttttcca gtaataattc gtgtgcaaac tagccattta gaaataatat ctaatttatc 4980 taagtgctca acaaccgtat ttgtcagaca aaatgacgag cagaaaaatc wtagactgta 5040 tattcttaaa tacwtagagg acaattwtcm cacaaaagat 
wtcttgcctc cactgaggct 5100 atttctttyt tgkaatcttt atccctaata ttttcccagc ttagtgacca ataatttata 5160 tcatwmaggt actctgtaag ccgataatac cttttgctta tatcccaata attgggacca 5220 aaaaaagtgc aaaagcgtgg gcgcagatcg agaaatttat tccgttgygg aatagactat 5280 ttgcatcaat tactgctcaa wgccgctgaa aatttctgca aattggtaag ggctttacgt 5340 gttttgtctt gtacawagct gttctattca gcaggagaca aacatggatt agcaagtatg 5400 ggtgtagtta tcactkaaag aaatcattgg cagtatagtc aactcattga aagtcctata 5460 ttaacgtcgc cgaaagttaa atagttttta cgatgagatg taggcattgt gataaatgtg 5520 ctgcacatca tcacaatcat tcagcatatc cataaacctc tcgaacatct taacatcatc 5580 tcccgtcact ggagttgttg tttgaggaat aaattggatt tcgtcgacat crractgaag 5640 cttttcaaag gcttcagata acgcttgctt ggccttaaaa tattcagtat gaggaaccag 5700 tacgctgatc ttaccgtttt ttgcttcaat atcggtgaca tccacatttt ccatcattaa 5760 tgtctccaat acgacttctt cgtcatttcc agtgaaaaca aggattgcac aatgattaaa 5820 catatggcta acactgcctt gggtaccaat cttgcttttg gttttggtaa aacaaatacg 5880 cacatcaccg aaggtgcgat tggggttatc cgttaaacaa tcaakaawam ccatamagtt 5940 cccnaggtcc caaaaccttc ataacggagt gscaawaaag tyttcccmcc mcgcsccccc 6000 tttagctttg tytagggcct tttgaaataa cgtgggcttg gaancttggt ttttttttgc 6060 tttatctatc catactgcgt agagcaagat taccttctgg atcactcccc ccntgatttt 6120 gcacagacat aaattgcgcg accatatttg ctgtagactt tggctttggc atcggaggtt 6180 ttagccattg attctttgcg gttctgatag gctcgaccca ttatataacc cctgattttt 6240 attgaacgaa gagtggattt tacaggtaac tatgagtatg gggaacctgc taatagtmmw 6300 ckwttgtccm wtatymarra ttgcyggtgg ttgtygcttc tgamtaaagc ctcaatattt 6360 gatagattca ctgaatcatt atcattaatg ggtttgataa gtatttataa gaggtttgcg 6420 gtatgatgca gtttgtaatt acctcctccc ccataataat aatgtactgt aaggaaactc 6480 aatgtcttac gattatgatt tgtttgtgat tggtgccggt tctggtggtg tgcgtgcgag 6540 tcgtattgca gcaggccttg gcgctaaagt cgcggtagct gaggatctct ttcttggtgg 6600 tacttgtgtt aatgttggtt gtgtaccaaa aaagcttttg gttctatggg tcacmttttt 6660 ytgaagagtt traascagcc gcaggttttg gttggacaat agggtcatcg tcttttcatt 6720 ggccamcatt acgtgacaat aaracaaaag aaatcgagcg tcttaatggc 
ggtttatcaa 6780 aacctcttag aaaagtgcgg gagtcgatat tattaatggg cggggcgacc attattgatc 6840 ctcatacgat agcagttggg gacagacagt tttactgctg aacgtatttt agttgctggc 6900 ctgccattcc tgatattcca gggagagaac atattatcag ttnctaacga agtgkkttwt 6960 ckgraagmsk wwmckaaaws srwwgctgtc gtagggggtg gctatattgc tgttgagttt 7020 gcaggtattt ttcaagggtt gggtagtgac attcatttat tgtatcgagg tgatttattt 7080 ctaaggggat ttgatcgaga tgttcgtgaa tttactgcca gtgagatgat aaagaaagga 7140 gtaaatttac attttaatcg cagtgtttct gctattgaaa agcaagtgga tggtagccta 7200 ttagtgggat taactgatgg ctcaaccttg gaagtggata ctattatgta tgccacaggt 7260 ygaaaaccar rmmyygaggs wyytggktyt ksawwrkrsc gctgtmaaas krrckyaaaw 7320 gggaagcctt tycaagtnta actgakaayt tttcaaanca agcagaagcc wbtytawttt 7380 aygcaagtwa ggggawtgtt aatagaccgg tatgncaatk aacvccaagt tgstctsggc 7440 tgaarggtat ggmcttaagc mcagctttta tattagtgac tmcagtggat taataanggt 7500 agattatggg ttttsgttgc cmagaaccgg tttttnttgc caamcccaan tatgggcacc 7560 gtaggttata gtgaagagcg ggccaagrgm wragtttgat acggtgbctg tttadaaatr 7620 gatttttaaa ccagatgaag ncatacgctg agtgncttct tngatngagc ggactttttg 7680 tgaagtnwat tagtagancc aaaacnmcag ataragtcat aggttgtcat atggtaggcg 7740 ctcracgcgg gagaaatctt gntattgcca taaaggcagg agccaccaaa gcagactttg 7800 atagcaccat aggtattcac cctacggttg ccgaagagtt tgtgactatg agagagcctg 7860 cgtatatatt atagcaatag gccaagggca gctacttgtt ttagtaaggc tatttttaca 7920 aatagtacca tcagataata taktgcggta gtttacgttc yamtgaatca kcagtkgtma 7980 wakkagtcat atagcaygms gwrtkatasg kgkattcata yyrtrcawaa syaaykckgt 8040 cgtcgaggga yataatkctc akrataatat wcrttcgasw cctgtysakk cccwaccacr 8100 satacywssc aaagarttgy agtratcrag ckwtgsakws tgamcgntgs matnakgttc 8160 aacgcatgkc ccagcctkat agcatcygac caytsagggc caawrkgmgt taaycccagt 8220 gtwcngttns atrnrsgacs mgktaatggt mggtgwttst wrkawgccsg mtcttmmaaa 8280 mcmsanngmr acgtacaagm rtgwcaccmg krkgcytrya snmattmgct atcamrcnca 8340 yssrrgggkk ggycttmawa arargggcaa aaaaaaaaan 8380 38 1812 PRT Endobugula sertula PEPTIDE (1)...(1810) Corresponds to open reading frame in 
SEQ ID NO29. 38 Met Val Val Val Glu Glu Phe Phe Val Ser Tyr Arg Asp Ile Leu Lys 1 5 10 15 Ala Leu Gln Asp Glu Lys Ile Ser Phe Glu Glu Ala Lys Tyr Lys Leu 20 25 30 Ile Lys Arg Lys Asp Lys Lys Ser Lys Gln Arg Leu Asn His Asp Arg 35 40 45 Glu Leu Asn Arg Ser Met Asn Ile Thr Pro Lys Ile Val Asn Asn Tyr 50 55 60 Gly Leu Val Leu Leu Gly Gly His Leu Phe Glu Glu Leu Arg Leu Ser 65 70 75 80 Glu Trp Lys Ala Ala Asn Pro Asn Pro Asn Glu Val Ser Ile Gln Val 85 90 95 Lys Ala Ser Ala Ile Ser Phe Thr Asp Thr Leu Cys Val Gln Gly Leu 100 105 110 Tyr Pro Ser His Tyr Pro Phe Val Pro Gly Phe Glu Val Ser Gly Val 115 120 125 Ile Arg Gln Val Gly Glu His Ile Thr Asp Leu His Val Gly Asp Glu 130 135 140 Val Ile Ala Phe Thr Gly Ser Ser Met Gly Gly His Ala Ala Tyr Val 145 150 155 160 Thr Val Pro Gln Asp Tyr Val Val Arg Lys Pro Lys Asp Leu Ser Phe 165 170 175 Glu Asp Ala Cys Ser Phe Pro Leu Ala Phe Ala Thr Val Tyr His Ser 180 185 190 Phe Ala Arg Gly Lys Leu Ser His Asn Asp His Ile Leu Ile Gln Thr 195 200 205 Ala Thr Gly Gly Cys Gly Leu Met Ala Leu Gln Leu Ala Arg Leu Lys 210 215 220 Gln Cys Val Cys Tyr Gly Thr Ser Ser Arg Glu Asp Lys Leu Ala Leu 225 230 235 240 Leu Lys Gln Trp Ala Leu Pro Tyr Val Phe Asn Tyr Lys Thr Cys Asn 245 250 255 Ile Asp Glu Glu Ile Gln Arg Val Ser Gly His Arg Gly Val Asp Val 260 265 270 Val Leu Asn Met Leu Pro Gly Glu His Ile Gln Gln Gly Leu Asn Ser 275 280 285 Leu Ala Lys Gly Gly Arg Tyr Leu Glu Leu Ser Met His Gly Leu Leu 290 295 300 Thr Asn Glu Pro Val Ser Leu Ser Ser Leu Arg Phe Asn Gln Ser Val 305 310 315 320 Gln Thr Ile Asn Leu Leu Gly Leu Leu Asn Lys Gly Asp Asp Gly Phe 325 330 335 Ile Gly Ser Val Leu Ala Gln Met Val Ser Trp Ile Glu Ser Gly Asp 340 345 350 Leu Val Ser Thr Val Ser Arg Ile Tyr Pro Leu Asp Gln Ile Gly Glu 355 360 365 Ala Leu Arg Tyr Val Ser Glu Gly Glu His Ile Gly Lys Val Val Val 370 375 380 Ser His Thr Ala Thr Glu Pro Met Asp Cys Arg Gln Arg Cys Ile Asp 385 390 395 400 Asn Val Leu Lys Gln Gly Gln Met Ala Ala Leu Thr Ala Thr Gly Gly 405 410 415 Lys 
Ser Arg Val Trp Gly Gly Thr Gly Val Asn Asp Lys Pro Ser Pro 420 425 430 Ala Val Gly Ile Glu Glu Arg Leu Leu Glu Gly Ile Ala Val Ile Gly 435 440 445 Leu Ser Gly Gln Tyr Pro Lys Ser Lys Thr Leu Glu Gln Phe Trp Gln 450 455 460 Thr Leu Ala Asp Gly Val Asp Cys Ile Ser Glu Ile Pro Ala Asp Arg 465 470 475 480 Trp Ser Leu Glu Glu Tyr Tyr Ser Pro Ile Pro Glu Gly Gly Lys Thr 485 490 495 Tyr Cys Lys Trp Met Gly Val Leu Glu Asp Met Asp Cys Phe Asp Pro 500 505 510 Leu Phe Phe Ala Ile Ser Pro Arg Glu Ala Glu Val Met Asp Pro Gln 515 520 525 Gln Arg Leu Phe Leu Glu Asn Ala Trp Ser Cys Ile Glu Asp Ala Gly 530 535 540 Ile Asn Pro Lys Met Leu Ser Arg Ser Arg Cys Gly Val Phe Val Gly 545 550 555 560 Cys Gly Ala Asn Asp Tyr Ser Ala Leu Met Asn Ser Ser His Ser Thr 565 570 575 Ser Leu Glu Leu Met Lys Glu Leu Gly Asn Asn Ser Ser Ile Leu Ser 580 585 590 Ala Arg Ile Ser Tyr Phe Leu Asn Leu Lys Gly Pro Cys Leu Ala Ile 595 600 605 Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Ile Ala Glu Ser Cys Asn 610 615 620 Ser Leu Val Leu Gly Thr Ser Asp Leu Ala Leu Ala Gly Gly Val Leu 625 630 635 640 Leu Met Pro Gly Pro Ser Leu His Ile Gly Leu Ser His Gly Glu Met 645 650 655 Leu Ser Val Asp Gly Arg Cys Phe Thr Phe Asp Gln Arg Ala Asn Gly 660 665 670 Phe Val Pro Gly Glu Gly Val Gly Val Val Leu Leu Lys Arg Met Ser 675 680 685 Asp Ala Val Arg Asp Gly Asp Pro Ile Arg Ala Val Ile Arg Gly Trp 690 695 700 Gly Val Asn Gln Asp Gly Arg Ser Asn Gly Ile Thr Ala Pro Ser Ser 705 710 715 720 Lys Ala Gln Ser Ala Leu Glu Gln Glu Val Tyr Gln Arg Phe Asn Ile 725 730 735 Asp Pro Ser Ser Ile Thr Leu Val Glu Ala His Gly Thr Gly Thr Lys 740 745 750 Leu Gly Asp Pro Ile Glu Val Glu Ala Leu Ala Glu Ser Phe Arg Val 755 760 765 Tyr Thr Asp Lys Arg His Tyr Cys Ala Leu Gly Ser Val Lys Ser Asn 770 775 780 Ile Gly His Leu Gly Val Gly Ala Gly Ile Ala Gly Val Thr Lys Val 785 790 795 800 Leu Leu Ser Leu Gln His Arg Met Leu Pro Pro Thr Ile His Cys Glu 805 810 815 Asp Val Asn Pro Gln Ile Ala Leu Glu Gly Ser Pro Phe Tyr Ile Asn 820 825 830 Thr Glu 
Leu Lys Pro Trp Gln Ser Gly Asp Ser Ile Pro Arg Arg Ala 835 840 845 Gly Val Ser Ser Phe Gly Phe Ser Gly Thr Asn Ala His Leu Val Leu 850 855 860 Glu Glu Tyr Leu Pro His Ser Thr Gly Thr Ile Glu Ser Phe Ala Ala 865 870 875 880 Asn His Ala Ser Thr Val Ile Ile Pro Leu Ser Ala Lys Ser His Asn 885 890 895 Ser Leu Tyr Thr Tyr Ala Gln Thr Leu Leu Ile Phe Leu Lys Arg Ser 900 905 910 Gln Val Thr Asp Ala Lys Lys Ile Thr Ile Asp His Met Glu Cys Arg 915 920 925 Leu Leu Asp Leu Ala Tyr Thr Leu Gln Val Gly Arg Glu Ala Met Asp 930 935 940 Lys Arg Ile Ser Phe Ile Val Asn Thr Lys Gln Ala Leu Val Glu Lys 945 950 955 960 Leu Asn Ala Phe Leu Glu Lys Glu Lys Thr Ile Thr Asp Cys Tyr His 965 970 975 Tyr Leu Phe Asp Ser Asp Lys Pro Ser Thr Glu Ile Phe Arg Leu Asp 980 985 990 Glu Asp Asp Lys Val Leu Ile Asn Ser Trp Ile Ser Gln Ser Gln Tyr 995 1000 1005 His Lys Leu Ala Glu Ala Trp Ser Gln Gly Leu Asp Ile Asp Trp Thr 1010 1015 1020 Leu Leu Tyr Thr His Ser Ser Thr Pro Arg Arg Ile Ser Leu Pro Thr 1025 1030 1035 1040 Tyr Pro Phe Ala Arg Asp Arg Tyr Trp Leu Pro Glu Lys Pro Arg Tyr 1045 1050 1055 Asn Ala Ala Asn His Pro Val Ser Asn His Gln Thr Thr Thr Gln Asn 1060 1065 1070 His Ser Arg Phe Ala Ile Asp Thr Asp His Asp Val Val Ala Glu Ile 1075 1080 1085 Met Gln Lys Thr His Gln Gln Glu Leu Glu Gln Trp Leu Leu Lys Leu 1090 1095 1100 Leu Phe Val Gln Leu Gln His Met Gly Leu Phe Gln His Arg Val Phe 1105 1110 1115 1120 Glu Thr Ala Thr Ala Leu Arg Gln Ser Ala Gly Ile Val Asp Lys Tyr 1125 1130 1135 Asp Arg Trp Trp His Glu Cys Leu Ser Val Leu Gln Asp Ala Gly Tyr 1140 1145 1150 Leu Glu Trp Lys Asp Asp Ser Val Ala Ala Ala Gln Ala Leu Glu Ser 1155 1160 1165 Glu Ser Gln Glu Ala Trp Trp Ser Arg Trp Asn Thr Glu Tyr Lys His 1170 1175 1180 Tyr Gln Asn Asp Pro Glu Lys Lys Thr Leu Ala Ile Leu Ile Asn Asp 1185 1190 1195 1200 Cys Leu Gln Ala Leu Pro Gly Val Leu Ser Gly Glu Gln Leu Ile Thr 1205 1210 1215 Asp Ile Ile Phe Pro Asn Gly Ser Met Glu Lys Met Glu Gly Leu Tyr 1220 1225 1230 Lys Asn Asn Arg Ile Ala Asp Tyr Cys Asn 
Gln Cys Val Gly Asp Leu 1235 1240 1245 Leu Val Gln Phe Ile Glu Ala Arg Leu Ser Arg Asp Ala Asn Ala Arg 1250 1255 1260 Ile Arg Ile Ile Glu Ile Gly Ala Gly Thr Gly Gly Thr Thr Ala Ile 1265 1270 1275 1280 Val Leu Pro Met Leu Gln Ala Tyr Gln Asp His Ile Asp Thr Tyr Cys 1285 1290 1295 Tyr Thr Asp Val Ser Lys Ala Phe Leu Met His Gly Gln Glu His Tyr 1300 1305 1310 Gly Glu Gln Tyr Pro Tyr Leu Ser Tyr Cys Leu Cys Asn Ile Glu Gln 1315 1320 1325 Asp Leu Val Ala Gln Gly Ile Ser Val Gly Asp Tyr Asp Ile Ala Ile 1330 1335 1340 Ala Ala Asn Val Leu His Ala Thr Arg Asn Ile His Glu Thr Val Ser 1345 1350 1355 1360 His Val Arg Gln Ala Leu Ala Ala Asn Gly Leu Leu Ile Leu Asn Glu 1365 1370 1375 Phe Ser Gln Lys Ser Val Phe Ser Ser Val Ile Phe Gly Leu Ile Asp 1380 1385 1390 Gly Trp Ala Leu Ser Glu Asp Thr Gly Leu Arg Ile Pro Gly Ser Pro 1395 1400 1405 Gly Leu Tyr Pro Lys Gln Trp Gln Ala Val Leu Glu Ala Ser Gly Phe 1410 1415 1420 Gly Asp Val Glu Phe Pro Leu His Asp Ala Arg Glu Leu Gly Gln Gln 1425 1430 1435 1440 Ile Ile Leu Ala Thr Asn Ala His Ala Asn Val Ala Ser Asp Leu Ala 1445 1450 1455 Thr Ser Val Ile Asp His Ala Pro Lys Arg Leu Pro Ser Ala Glu Val 1460 1465 1470 Ser Met Asp Glu Arg Val Ser His Asp Ala Met Met Lys Ala Ser Val 1475 1480 1485 Lys Gln Leu Leu Val Glu Gln Leu Ser Gln Ser Leu Lys Leu Asp Met 1490 1495 1500 Asn Glu Ile His Pro Asp Glu Ser Phe Ala Asp Tyr Gly Val Asp Ser 1505 1510 1515 1520 Ile Thr Gly Ala Ser Phe Ile Gln Gln Leu Asn Asp Thr Leu Thr Leu 1525 1530 1535 Thr Leu Lys Thr Val Cys Leu Phe Asp His Ser Ser Val Asn Arg Leu 1540 1545 1550 Thr Ala Tyr Leu Leu Ser Asp Tyr Gly Asp Asp Ile Ala Gln Trp Leu 1555 1560 1565 Ala Thr Ala Pro Ala Leu Val Asp His Pro Gln Ser Val Val Ser Gln 1570 1575 1580 Val Leu Pro Glu Arg Ser Pro Ala Ser Thr Gln Ala Lys Pro Leu Pro 1585 1590 1595 1600 Ser Val Pro Pro Ser Leu Ser Met Glu Ser Pro Val Gln Gln Glu Ser 1605 1610 1615 Ile Ala Ile Ile Gly Met Ser Gly Arg Phe Ala Ala Ser Glu Asn Leu 1620 1625 1630 Glu Ala Phe Trp Gln Gln Leu Ala Gln Gly 
Val Asp Leu Val Glu Pro 1635 1640 1645 Ala Ser Arg Trp Gly Pro Gln Ala Glu Thr Tyr Tyr Gly Ser Phe Leu 1650 1655 1660 Lys Asp Met Asp Gln Phe Asp Pro Leu Phe Phe Asn Leu Ser Gly Val 1665 1670 1675 1680 Glu Ala Ser Tyr Met Asp Pro Gln Gln Arg Cys Phe Leu Glu Glu Ser 1685 1690 1695 Trp Asn Ala Leu Glu Asn Ala Gly Tyr Val Gly Asp Gly Ile Glu Gly 1700 1705 1710 Lys Arg Cys Gly Ile Tyr Ala Gly Cys Val Ser Gly Asp Tyr Ala Gln 1715 1720 1725 Leu Leu Gly Asp Gln Pro Pro Pro Gln Ala Phe Trp Gly Asn Ala Ser 1730 1735 1740 Ser Ile Ile Pro Ala Arg Ile Ala Tyr Tyr Leu Asn Leu Gln Gly Pro 1745 1750 1755 1760 Ala Thr Ala Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His 1765 1770 1775 Leu Ala Cys Gln Ala Leu His Leu Asp Glu Met Glu Met Ala Leu Ala 1780 1785 1790 Gly Gly Val Ser Leu Tyr Pro Thr Pro Ile Ile Val Glx Val Phe Ala 1795 1800 1805 Trp Cys Arg Tyr 1810 

What is claimed is:
 1. A composition, comprising: at least one isolated nucleic acid molecule that encodes at least one polypeptide that catalyzes at least one step in the synthesis of at least one polyketide or bryopyran ring, wherein said at least one nucleic acid is derived from at least one marine organism.
 2. The composition of claim 1, wherein said at least one bryopyran ring comprises at least one bryostatin.
 3. The composition of claim 1, wherein said at least one polypeptide comprises at least one activity of at least one polyketide synthase.
 4. The composition of claim 1, wherein said at least one marine organism comprises at least one bacterium.
 5. The composition of claim 4, wherein said at least one bacterium comprises at least one Candidatus.
 6. The composition of claim 5, wherein said at least one Candidatus comprises at least one Endobugula.
 7. The composition of claim 6, wherein said at least one Endobugula comprises at least one Endobugula sertula.
 8. The composition of claim 1, wherein said at least one marine organism comprises at least one invertebrate.
 9. The composition of claim 8, wherein said at least one invertebrate comprises at least one Bugula.
 10. The composition of claim 9, wherein said at least one Bugula is Bugula neritina.
 11. The composition of claim 9, wherein said at least one Bugula is Bugula pacifica.
 12. The composition of claim 1, wherein said at least one nucleic acid molecule further comprises at least one expression control sequence.
 13. The composition of claim 1, wherein said nucleic acid molecule is in a vector.
 14. The composition of claim 13, wherein said vector is within a cell.
 15. The composition of claim 1, wherein said at least one nucleic acid molecule is within a cell.
 16. A composition, comprising a library of nucleic acid molecules of claim 1.
 17. A composition, comprising: at least one isolated polypeptide that catalyzes at least one step in the synthesis of at least one polyketide or bryopyran ring, wherein said at least one polypeptide is derived from at least one marine organism.
 18. The composition of claim 17, wherein said at least one bryopyran ring comprises at least one bryostatin.
 19. The composition of claim 17, wherein said at least one polypeptide comprises at least one activity of at least one polyketide synthase.
 20. The composition of claim 17, wherein said at least one marine organism comprises at least one bacterium.
 21. The composition of claim 20, wherein said at least one bacterium comprises at least one Candidatus.
 22. The composition of claim 21, wherein said at least one Candidatus comprises at least one Endobugula.
 23. The composition of claim 22, wherein said at least one Endobugula comprises at least one Endobugula sertula.
 24. The composition of claim 17, wherein said at least one marine organism comprises at least one invertebrate.
 25. The composition of claim 24, wherein said at least one invertebrate comprises at least one Bugula.
 26. The composition of claim 25, wherein said at least one Bugula is Bugula neritina.
 27. The composition of claim 25, wherein said at least one Bugula is Bugula pacifica.
 28. The composition of claim 17, wherein said at least one polypeptide is within a cell.
 29. A composition, comprising a library of polypeptides of claim 17.
 30. A method of making a polyketide or bryopyran ring containing composition, comprising: providing the composition of claim 1, and synthesizing a composition therefrom which comprises at least one polyketide or bryopyran ring.
 31. The method of claim 30, wherein said at least one bryopyran ring comprises at least one bryostatin.
 32. A composition made by the method of claim 30.
 33. The composition of claim 32, wherein said composition does not comprise a known bryostatin.
 34. The composition of claim 30, comprising at least one pharmaceutically acceptable carrier.
 35. The composition of claim 30, wherein said composition is a pharmaceutical composition.
 36. A method of making a polyketide or bryopyran ring containing composition, comprising: providing the composition of claim 17, and synthesizing a composition therefrom which comprises at least one polyketide or bryopyran ring.
 37. A composition made by the method of claim 36.
 38. The composition of claim 37, wherein said composition does not comprise a known bryostatin.
 39. The composition of claim 37, comprising at least one pharmaceutically acceptable carrier.
 40. The composition of claim 37, wherein said composition is a pharmaceutical composition.
 41. A method for identifying at least one nucleic acid molecule encoding at least one activity of a PKS, comprising: contacting a nucleic acid molecule of claim 1 with a sample, and identifying nucleic acid molecules in said sample that hybridize with said nucleic acid molecule of claim 1.
 42. The method of claim 41, wherein said sample is derived at least in part from an environmental sample.
 43. The method of claim 42, wherein said environmental sample is derived at least in part from a marine environment.
 44. A nucleic acid molecule identified by the method of claim 41.
 45. The nucleic acid molecule of claim 44, comprising an expression control sequence.
 46. The nucleic acid molecule of claim 44 in a vector.
 47. The nucleic acid molecule of claim 44 in a cell.
 48. A composition comprising a library of nucleic acid molecules of claim 44.
 49. A method for identifying a bioactive compound, comprising: contacting the composition of claim 32 with an in vitro, ex vivo or in vivo detection system and determining the bioactivity of said compound.
 50. A bioactive compound identified by the method of claim 49.
 51. The bioactive compound of claim 50 in a pharmaceutically acceptable carrier.
 52. The bioactive compound of claim 50, wherein said bioactive compound is a pharmaceutical compound.
 53. A method for identifying a bioactive compound, comprising: contacting the composition of claim 37 and determining the bioactivity of said compound.
 54. A bioactive compound identified by the method of claim 53.
 55. The bioactive compound of claim 54 in a pharmaceutically acceptable carrier.
 56. The bioactive compound of claim 54, wherein said bioactive compound is a pharmaceutical compound.
 57. An isolated bacterial symbiont of Bugula, wherein said bacterial symbiont comprises at least one polypeptide that has at least one PKS activity.
 58. An isolated bacterial symbiont of B. neritina or B. pacifica, wherein said bacterial symbiont comprises at least one polypeptide that has at least one PKS activity.
 59. A composition comprising at least one polyketide, bryopyran ring or bryostatin present in Bugula pacifica.
 60. The composition of claim 59, wherein said composition has at least one activity of at least one bryostatin.
 61. The composition of claim 59, wherein said composition is isolated from Bugula pacifica.
 62. The composition of claim 1 which hybridizes under moderate hybridization conditions to any one of SEQ ID NOS. 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, or the complement thereof.
 63. The composition of claim 1 which hybridizes under stringent hybridization conditions to any one of SEQ ID NOS. 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, or the complement thereof.
 64. An isolated nucleic acid molecule comprising any one of SEQ ID NOS. 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, or the complement thereof.
 65. An isolated nucleic acid molecule encoding SEQ ID NOS. 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, or 38.